Potential datasources

This page is being considered for cleanup. Please discuss this page.

This article or section may contain out-of-date information. The information may no longer be correct, or may no longer have relevance.
If you know about the current state of affairs, please help keep everyone informed by updating this information. (Discussion)

The following is a list of potential datasources. Some of these are already in use, or have been imported fully. Others are under investigation, and some have been rejected (details of rejections are on here too). There are basically two criteria:

  • Licenses - We are only interested in 'free' data. In fact, we must be able to release the data under our OpenStreetMap License. This means it's OK to use Public Domain data; for other sources, please check license compatibility. (Overview: Category:Data Licenses)
  • Accuracy/Quality - All GIS data is limited in terms of accuracy and coverage of features. Some datasources are very basic. Because of our wiki-style map building approach, this isn't necessarily a problem: we could import the data anyway and improve it later. However, there are limits to this reasoning. For example, it requires active OSM community members in the affected area. Also, in areas where we already have superior coverage through other means, we will not be interested in importing lower quality data. For example, a low-accuracy map of the main roads of London is of no use to us, whereas similar coverage of an unmapped city would be useful.

With these criteria in mind, look at the following sources, check whether there are license issues and whether the data is useful to us or offers scope for collaboration, and report your results on the Mailing lists.

Listing datasources here is only the first stage in a careful investigation/planning process for imports. If you're interested in performing imports please be aware:

Imports and automated edits should only be carried out by those with experience and understanding of the way the OpenStreetMap community creates maps, and only with careful planning and consultation with the local community.
See Import/Guidelines and Automated Edits code of conduct for more information. Imports/automated edits which do not follow these guidelines might be reverted!

Datasources which we have imported should also go on the Import/Catalogue. We also have a list of Related Projects.


General

Out-of-copyright mapping

See separate page on out-of-copyright maps.

Dispatch centres

Many transportation companies (taxi, delivery and courier services) use GPS units in their vehicles, either to track them in real time in their dispatching centre or just for logging. It should be in their interest to provide their GPS tracks in order to improve the maps in their areas of operation. Taxis should cover a city quickly and thoroughly, while lorries and similar vehicles are probably better for longer distances. We are already using Ecourier tracks.

Global Coverage

MapSwipe project

The MapSwipe project is part of the OpenStreetMap community. The goal is high-quality geographical data, freely accessible and available to everyone. OSM’s reciprocal license protects the data from being appropriated by services that do not share back to OSM.

MapSwipe is released under a "liberal" non-reciprocal license (Creative Commons Attribution). This only requires that users acknowledge the source. You can do whatever you want with the data, just make sure to credit the MapSwipe contributors.

The project is managed by the Heidelberg Institute for Geoinformation Technology at Heidelberg University (HeiGIT gGmbH).

World Port Index (position and data of harbours worldwide)

Perry-Castañeda Library Map Collection (contains Army Map Service Topographic Map Series)

The CIA World DataBank

The CIA World DataBank II is a collection of world map data, consisting of vector descriptions of land outlines, rivers, and political boundaries. It was created by the U.S. government in 1986 and the files have not changed since, so it should be in the public domain. There are roughly 1.8 million shape points in Europe and North America.

GPX files can be created from the CIA World DataBank with the Perl script at

There is now a 0.5-compatible import script for OSM at [1]. Please document your upload progress on WikiProject_Import_WDB.
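As a rough illustration of what producing GPX from vector outlines involves, here is a minimal Python sketch. It is not the Perl script or the import script mentioned above, and it does not read the actual World DataBank file format: it assumes the outlines have already been decoded into lists of (latitude, longitude) pairs. The function name `points_to_gpx` and the `creator` value are hypothetical.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def points_to_gpx(segments, creator="wdb-sketch"):
    """Serialise line segments to a GPX 1.1 track.

    `segments` is an iterable of segments, each a list of (lat, lon)
    pairs in decimal degrees. Decoding the source data into this shape
    is assumed to have happened elsewhere.
    """
    gpx = ET.Element("gpx", version="1.1", creator=creator, xmlns=GPX_NS)
    trk = ET.SubElement(gpx, "trk")
    for segment in segments:
        trkseg = ET.SubElement(trk, "trkseg")
        for lat, lon in segment:
            # GPX stores coordinates as attributes of each track point.
            ET.SubElement(trkseg, "trkpt", lat=f"{lat:.6f}", lon=f"{lon:.6f}")
    return ET.tostring(gpx, encoding="unicode")


if __name__ == "__main__":
    # Two made-up points, purely to show the call.
    print(points_to_gpx([[(51.5, -0.1), (51.6, -0.2)]]))
```

Each outline becomes one track segment, so separate coastlines or rivers stay separate when the file is opened in an editor.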

GEOnet Names Server

Main page GEOnet Names Server

This database contains 5.5 million geo features (mainly place names). It's updated monthly by the agency.

The 'about' page states:

"There are no licensing requirements or restrictions in place for the use of the GNS data. However, we recommend using the following citation to identify the GNS as a source ..."

The data is messy, with spelling mistakes and strange historical name entries. Accuracy is also poor. We have imported GNS data in some countries, usually where mapping was not progressing quickly anyway. This hasn't been carried out systematically at a global level.

Geonames.org (Rejected)

Main page: Geonames

This is a "wiki" style name server at geonames.org which claims to release data with an open license. However, it uses various unfree datasources, and pinpoints locations on an unfree Google map. It therefore does not meet the strict requirements of OSM.

We do, however, use the dataset for Search.

The Map Library

The Map Library is a source of public domain basic map data concerning administrative boundaries in developing countries. The format of available data is Map Maker DRA, ESRI Shape file or MapInfo MIF.

In addition to the Map Library data they also are hosting a copy of version 0.9 of the Global Administrative Areas database of the BioGeoMancer project, but:

Note: The administrative areas dataset is licensed under a noncommercial creative commons license, so it cannot be used in OpenStreetMap. [2]

OpenGeoDB

OpenGeoDB was started as a German project for places within Germany and has since been extended to Austria, Switzerland, Liechtenstein and Belgium. The initial data was derived from the GEOnet Names Server, but has been extended with postal area codes, population, governmental structure etc.

The data is public domain and can be used without any limitations.

The homepage is http://opengeodb.de

Dumps are released on sourceforge: http://sourceforge.net/projects/opengeodb/

Data can be edited and downloaded any time via http://fa-technik.adfc.de/code/opengeodb.pl

The Berkeley BioGeo gData set

See Berkeley_BioGeo_data for more details of this many-country database. It concerns ESRI compatible free data, in which free needs to be examined. It is presumed that ESRI compatible files can be converted into OSM formats, even when ESRI itself is a company. Is a dataformat leading to a problem with licenses? Orwall 24 Jan 2008

This data set uses a blanket CC-BY-NC-SA. The NC part makes it incompatible with OSM. Individual parts of the database have additional restrictions, and there may be an overall database right being infringed. We may need to look at purging what has already been imported. Chriscf 09:10, 21 October 2008 (UTC)
I'll investigate the part containing global administrative areas (but called GADM). It seems to be very useful, because administrative boundaries are missing for large parts of the world. One table in the geodatabase (MS Access) contains information on copyright for every country. I'll try to contact one of the persons at Berkeley, and ask if they agree to lift the NC restriction for OSM. --Fsteggink 16:32, 14 February 2009 (UTC)

Shoreline databases

There are several freely available datasets for coastlines. For paths running along cliff tops you wouldn't want to trust your life to any of this data - but for the purposes of giving you a plausible landmass to draw maps on they're all useful. See Proposed features/Coastline

Prototype Global Shoreline

See the main page on PGS

Our script for importing into the OSM database is "Almien coastlines (PGS)"

GSHHS - Global Self-consistent Hierarchical High-resolution Shoreline

http://www.ngdc.noaa.gov/mgg/shorelines/gshhs.html

The import script is at Almien coastlines (GSHHS), but it is not being used because PGS is preferred.

GSHHS is a database of lakes and shorelines of the world, with a resolution of 200 m, produced in 1996. It is delivered with source code, written in C, and the PD database. The data is also available as shapefiles. It tends to give more accurate coastlines than VMAP0 outside of the US (for some reason VMAP0 seems more accurate for the US). There are issues with inland water (such as the Great Lakes) though.

(The above info appears slightly incorrect: according to the README the last update was in 2004. It was generated from the CIA WDBII and the WVS).

VMAP0

US Department of Defense vector map (contains far more than just coastlines, but can be rather "coarse"). Also known as Digital Chart of the World. See Wikipedia.

World Vector Shoreline

http://www.ngdc.noaa.gov/mgg/coast/wvs.html

Claims to be suitable for use at 1:250,000 scale; seems fairly accurate, at least for Great Britain and Ireland.

EVS Islands (Data Dependent)

Based on various Landsat images. Some comparisons are made against PGS, and there are also maps traced from Google Earth imagery. The author makes very beautiful maps of all the islands of the world. http://evs-islands.blogspot.com

SWBD - Shuttle radar topography mission Water Body Data.

ftp://e0srp01u.ecs.nasa.gov/srtm/version2

Derived from the SRTM digital elevation model. Newer than the rest, potentially more accurate.

From the README: SRTM data are distributed in two levels: SRTM1 (for the U.S. and its territories and possessions) with data sampled at one arc-second intervals in latitude and longitude, and SRTM3 (for the world) sampled at three arc-seconds. Three arc-second data are generated by three by three averaging of the one arc-second samples.

For reference: one arc-second at the equator is about 30 metres. For the US this means around 18 m accuracy, but for elsewhere it's pretty bad. At the latitude of Brussels, three arc-seconds would be about 60 metres. For any latitude l, 1 arc-second of longitude is about 30 metres multiplied by cos(l).
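The cos(l) rule of thumb above can be checked with a few lines of Python. This is only a sketch: it uses the round figure of 30 metres per arc-second at the equator from the text, and the latitude used for Brussels (about 50.85° N) is an approximation added for illustration.

```python
import math

# Round figure from the text: one arc-second at the equator is ~30 m.
METRES_PER_ARCSEC_AT_EQUATOR = 30.0


def arcsec_longitude_metres(latitude_deg, arcseconds=1.0):
    """Approximate east-west ground distance covered by `arcseconds`
    of longitude at the given latitude (spherical approximation)."""
    return (METRES_PER_ARCSEC_AT_EQUATOR * arcseconds
            * math.cos(math.radians(latitude_deg)))


if __name__ == "__main__":
    # Brussels latitude is approximate and only used as an example.
    for name, lat in [("equator", 0.0), ("Brussels", 50.85)]:
        one = arcsec_longitude_metres(lat, 1)
        three = arcsec_longitude_metres(lat, 3)
        print(f"{name}: 1 arc-second = {one:.1f} m, 3 arc-seconds = {three:.1f} m")
```

For Brussels, three arc-seconds of longitude come out at roughly 57 metres, which matches the "about 60 metres" quoted for SRTM3 data there.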

The SWBD data is derived from this and claims one arc-second accuracy worldwide, though only lakes greater than 200 m are included.

Converting SRTM into OSM format can be done with Srtm2Osm.

Earth Observing Laboratory NCAR.

EOL provides state-of-the-art atmospheric observing systems and support services to the university-based research community for climate and weather research. Most of it is free(?)

  • Unless I misunderstand the OSM license, this quote from Prohibited Uses on the NCAR terms page would seem to indicate that the data cannot be used unless they grant a special waiver covering the possibility of their data being used in a work that is sold. Other conditions may apply too. Data from here should wait until the new OSM licensing is decided, so that we don't risk them not agreeing to the new requirements after data under the old agreement has been imported.

"Use that is inconsistent with UCAR's non-profit status and mission. UCAR is a non-profit, tax-exempt organization and, as such, is subject to specific federal, state, and local laws regarding sources of income, political activities, use of property, and similar matters. As a result, commercial use of this Site or the Materials for non-UCAR purposes is generally prohibited. Requests for exception to this should be directed to ipinfo@ucar.edu." Rjhawkin 09:35, 18 April 2009 (UTC)

ICEDS European Data Server

The ICEDS server is hosted and run from UCL with the aim of serving high-resolution global and continental data. One of the nicest things about the site is its ease of use - you view an area and then click a button to download the relevant data (SRTM, Geological Maps, Landsat 5 & 7, etc). The site is also an exercise in Open Standards and is fully compliant with the OGC's WMS, WFS standards - meaning that a variety of external data sources (Weather and bathymetry for example) can be overlaid with the click of a button (Why not OSM data?). In common with OSM it is also based on an Open source principle - Map Server on Apache on Linux.

Two things I particularly like about the site:

  • Extensive use of Java-Script embeds a lot of features onto the page. Try out the "Flicker" and "Swipe" tools and the drop down menus for downloading SRTM data.
  • The near realtime (15 minute delay I think) Global Clouds layer.

hostip.info Geolocation data

http://www.hostip.info/ has a geolocation database. Geolocation means mapping from IP address to location. It's a neat trick, although the accuracy is obviously quite variable (try it on the site there). We could try to use this to jump to the user's home town, rather than the default world map, when a visitor enters the openstreetmap.org homepage. It might be fun, or it might just be confusing when it gets it wrong.

Ip Global Positioning (Unfree)

http://www.ipgp.net will show a map with the location of an IP address. In some cases it will show the location of the Internet Service Provider instead, but you can still get a rough idea.

DAFIF

Defense Aeronautical Flight Information File. Published by the US Department of Defense, lists pretty much every airport runway worldwide. Apparently, they're going to be removing the dataset from public domain in October 2006. This is used to populate the scenery databases for Flightgear flight simulator and X-Plane. Whilst the dataset is still being kept updated in the public domain, I'll work out an import to Openstreetmap of the runways and taxiways. Welshie 09:51, 25 Jun 2006 (UTC)

ourairports.com

ourairports.com is a replacement for some of the DAFIF data, put into the public domain.

Other (partial) DAFIF replacements

APRS data

Many Amateur radio operators have GPS receivers connected to packet radio transmitters and transmit their location in a standard format known as APRS (including fixed stations, weather stations, and mobile stations) on standard frequencies. The data can be picked up from these frequencies and is injected into an internet infrastructure, APRS-IS. The infrastructure includes the databases APRSWorld and OpenAPRS, and a substantial supporting body of open source software (which would make it simple to set up a database to selectively mine internet feeds for data coming from mobile stations that are likely to be traveling on roads). APRS data does have some intrinsic limitations: positions are usually transmitted at 5-10 minute intervals, and a widely used non-compressed format limits the precision of latitude and longitude to 0.01 minute (see the APRS protocol specification for details). Given the origin of APRS data as Amateur radio transmissions, they are likely to be in the public domain.

  • APRS is more a use-specific data point. A potential problem is that some APRS stations are not following a land-based track. I know of several APRS devices mounted to weather balloons, hot air balloons, regular and RC aircraft, or boats. There is no requirement to indicate to a third party what use your APRS station is being put to when it transmits a packet. Rjhawkin 09:52, 18 April 2009 (UTC)
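To put the 0.01 minute precision limit into ground distance, the following Python sketch converts it to metres. It relies on one minute of latitude being about one nautical mile (1852 m) and on a spherical approximation for longitude; the function name is made up for this example.

```python
import math

# One minute of latitude is about one nautical mile.
METRES_PER_ARCMINUTE = 1852.0


def aprs_resolution_metres(latitude_deg, step_minutes=0.01):
    """Return (north_south, east_west) ground size in metres of one
    coordinate step of `step_minutes` at the given latitude."""
    north_south = METRES_PER_ARCMINUTE * step_minutes
    # Meridians converge towards the poles, so longitude steps shrink.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for lat in (0.0, 45.0, 60.0):
        ns, ew = aprs_resolution_metres(lat)
        print(f"lat {lat:4.1f}: {ns:.2f} m north-south, {ew:.2f} m east-west")
```

A grid step of about 18.5 m north-south is coarse compared with a typical GPS trace, so uncompressed APRS positions would at best suit rough road alignment.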

Placeopedia (rejected)

The people who gave us Pledgebank made Placeopedia as well. It's a way to use Google Maps to georeference Wikipedia articles. The database is held separately from the Wikipedia one, and can be accessed with RSS, XML and KML feeds from their site.

It says it is available under a Creative Commons license, but not which one.

This has been discussed on the mailing list, and the consensus is that Placeopedia is a derivative work of Google Maps. Although it's very unlikely to be challenged by Google Maps / Teleatlas, it's not suitable for use in OpenStreetMap (https://lists.openstreetmap.org/pipermail/talk/2005-October/001212.html is one of many relevant posts, and a good summary of the argument).

SALB

"The Second Administrative Level Boundaries (SALB) dataset is a dataset aiming at improving the availability of information about administrative boundaries down to the second subnational level. The SALB dataset forms part of the UN geographic database and was developed in the context of the United Nations Geographic Information Working Group (UNGIWG)."


True Marble imagery (Unearthed Outdoors)

Full color global imagery, built from the best available Landsat scenes which have been chosen to optimize vegetation quality and to reduce the cloud cover. These orthorectified scenes were fed into proprietary, advanced, color adjustment algorithms to produce true color imagery while reducing atmospheric haze. This imagery was optimized to produce the most natural color while maintaining a high local contrast and dynamic range, resulting in a significant improvement over similar products.

A reduced resolution of the True Marble™, Global True Marble™, is available for free download. This dataset can be used to preview pending purchases, or for any other usage. We only ask that copyright be attributed to Unearthed Outdoors when reproduced.

Licensed under a Creative Commons Attribution 3.0 United States License.

http://www.unearthedoutdoors.net/global_data/true_marble/download

World Database on Protected Areas (WDPA) (Rejected)

http://www.wdpa.org/Default.aspx

"The World Database on Protected Areas is a foundation dataset for conservation decision making. It contains crucial information from national governments, non-governmental organizations, academic institutions, international biodiversity convention secretariats and many others. It is used for ecological gap analysis, environmental impact analysis and is increasingly used for private sector decision-making...."

WDPA data use email request

Ramsar Wetland Sites

The data included in the database derives from the Ramsar Information Sheet, the Ramsar National Report and/or from Administrative Authority correspondence provided by Contracting Parties. This includes information on wetland types, land uses, threats, hydrological values of the sites etc. The Ramsar Sites Database is primarily a tool to look at Ramsar Sites across geographic and thematic boundaries, useful and necessary for maintaining an overview of a global network of well over 1700 internationally important wetlands from 158 countries.

OpenCellID cell

http://www.opencellid.org/
http://wiki.opencellid.org

OpenCellID is the biggest open data source for CellIDs (GPS positions of cell towers), and provides an API to both:
- gather data of newly discovered cell towers
- locate a cell phone or tracking device using cell tower positions

Source code is provided under a GPL license, while data is collected under a Creative Commons Share Alike license V3.0.

Plans exist to support Wifi/Bluetooth positioning.

see also: OpenCellID

OpenBmap cell and wifi access points

http://realtimeblog.free.fr/

openBmap is a free and open map of wireless communicating objects (e.g. cellular antenna, Wi-Fi, Bluetooth). It provides tools to pool data, and to create and access this map.

Data is available under Creative Commons Attribution-Share Alike 3.0 Unported license, and is currently provided as a static download, updated regularly.

The site owners intend to create an API, which could allow regular, scripted updates to the data. Alternatively, it could be suggested that they adapt their site/software to upload data directly to OSM.

see also: openBmap

CODATA Roads Data Catalog

A catalog of roads data sets, globally, compiled by the gRoads project. There are hundreds of data sets listed here (many probably already investigated), under all sorts of licensing schemes. There are certainly data sets new to the OSM community that are worth investigating.

The interim PDF is posted at Media:CODATA_Roads_Data_Catalog_v1.pdf

There are possible plans to post this database to the web, with query front end.

United Nations Laws Of the Sea

The UN DOALOS has a database of all maritime border claims. There is no apparent license on the data, and these claims are the definitions of the borders. However, the database is a collection of separate PDF files containing the coordinates of baselines and territorial borders. The text has to be parsed by hand, and other maritime borders need to be calculated from the baseline (see maritime borders). Data is to be tagged with source=UNCLOS.
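Once coordinates have been copied out of the PDF text, they usually need converting to decimal degrees before they can be used in OSM. The Python sketch below shows one way to do that. The input notation (degrees, minutes, optional seconds, hemisphere letter) is an assumption made for this example; the actual documents may write coordinates differently, and the regular expression would need adjusting to match.

```python
import re

# Hypothetical notation: 34°30'00"S or 58°22'W. Adjust to the real documents.
DMS = re.compile(
    r"""(\d{1,3})\s*°\s*(\d{1,2})\s*['′]\s*(?:(\d{1,2}(?:\.\d+)?)\s*["″]\s*)?([NSEW])"""
)


def dms_to_decimal(text):
    """Convert one degrees/minutes/seconds coordinate to decimal degrees.

    South and west are returned as negative values.
    """
    match = DMS.search(text)
    if match is None:
        raise ValueError(f"no coordinate found in {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds or 0) / 3600
    return -value if hemisphere in "SW" else value


if __name__ == "__main__":
    # Made-up sample values, purely to show the call.
    print(dms_to_decimal("34°30'00\"S"), dms_to_decimal("58°22'W"))
```

Raising an error on unmatched text, rather than returning a default, makes it harder for a mis-transcribed coordinate to slip into an import unnoticed.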

Freebase

Though the two projects have significant overlap and are both Creative Commons, their licenses are incompatible (see http://wiki.freebase.com/wiki/License_compatibility#Compatible_licenses_for_structured_data). Freebase.com is licenced under CC-BY. The biggest hurdle in using OSM data within Freebase has been the OSM share-alike license. All the data in Freebase is available under a CC-BY (attribution only) license, allowing anyone to use the data in any way they want as long as they acknowledge Freebase. The main difference between the licenses is that the Freebase CC-BY license does not impose a share-alike requirement on users of the data.

That said, the OSM community is currently voting on an updated license which could make collaboration between the two data sets much easier in the future (for more information see: Open_Database_License)

In spite of the licence conflict, simply linking these resources together wouldn't be a problem.

We could connect relations to locations:

  • /en/buenos_aires
  • /relation/369450

and connect OSM's amenity nodes (or 'points of interest'). Both databases, for example, have a Mount Sinai Hospital:

  • osm @ /node/42606735
  • freebase @ /guid/9202a8c04000641f800000000039b682

List of Marine Lights

The NGA List of Lights is a database of lights, radio, and fog signals. It contains almost all lights in the world and is published by the US National Geospatial-Intelligence Agency, which may have a useful copyright permission.

The database can be queried online and returns information in HTML, JSON, XML and CSV.
Information is also published as PDF documents, and we have already parsed them into an OSM-appropriate format.

The import is planned to use the OpenSeaMap proposed seamark:*=* tags.

EC-JRC built-up areas / density of building from Bing

See EC-JRC built-up areas from Bing

This is partially a follow-up to this thread: https://lists.openstreetmap.org/pipermail/imports/2011-June/000987.html

We (the Joint Research Centre of the European Commission) have developed a set of scripted tools to extract built-up areas, built-up density, and location of human settlements from (very) high resolution satellite imagery. The result is quite reliable and our methodology has been warmly received by the remote sensing community. Part of the methodology is described in this paper: http://ieeexplore.ieee.org/search/srchabstract.jsp?tp=&arnumber=5764726

The Gateway to Astronaut Photography of Earth

http://eol.jsc.nasa.gov/

Photographs taken by astronauts from the International Space Station. There are about 25 000 000 photos.

Conditions for Use

source=Image courtesy of the Image Science & Analysis Laboratory, NASA Johnson Space Center - MissionID-RollID-FrameID

US Dept. of State, International Boundaries

From the Office of the Geographer and Global Issues (INR/GGI), we are free to use these international boundary lines (attribute = 1), other lines of international separation (disputed lines; attribute = 3) and other lines of separation (attribute = 7.) They reflect US govt. policy and thus not necessarily de facto control. This data set is in the public domain; no restrictions on its use.

The most recent edition (December 2015: "LSIB6b") is available at http://www.data.gov/ by searching for large scale international boundaries.


The lines are produced and regularly updated by geographers at State and colleagues from other agencies based on imagery, old treaties/maps, and other sources. For most of Europe and the U.S., the lines are not particularly accurate; accurate data for these areas are widely available via other sources. This "Large Scale International Boundary (LSIB)" line data as well as polygons are also available at http://geocommons.com/search?model=&query=LSIB, though there may be some issues with the processing in GeoCommons.

GLIMS Glacier Database

Main page GLIMS Glacier Database

A database of glacier outlines with a global (although incomplete) coverage.

Hosted and maintained by the NSIDC, licensed as public domain.

Natural Earth

See their website.

SimpleGeo

A POI dataset (60M, global coverage) that was opensourced after SimpleGeo was acquired by another company.

Open Charge Map

This is a worldwide database of electric vehicle charging stations. It is mostly open source; some data is imported from third parties, but that data is kept segregated. It has about 27,000 locations currently listed. The data is mostly community sourced and maintained.

See OCM Web Site

See Open Charge Map for further details.

Esri Community Maps AOIs

This is a worldwide dataset with vector feature data for special areas of interest (AOIs) that have been contributed using the Esri Community Maps Editor app. The data that is reviewed and accepted by Esri is included in Esri basemaps and also made available to export for use in other maps. The dataset currently includes about 150,000 features of various types, including buildings, streets, parking lots, landscape areas, and points of interest. The data is provided under a CC BY license with explicit waiver for use in OpenStreetMap. Selected layers (e.g. buildings) could be exported from the dataset and made available for use in OSM editing to save time in re-creating these features from scratch.

See Esri Community Maps AOIs for more details.

Allen Coral Atlas

See the announcement at https://allencoralatlas.org/blog/mapping-the-worlds-coral-reefs/. I am asking them to consider allowing export of their data, or a subset of it, under an OSM-compatible license. Will update with response.

They have said that releasing their data with an OSM-compatible license is "beyond our scope". (2021-09-29)


Local data

This content moved to new page on January 29, 2022.
See Potential Datasources/Local data


References