User:Cristoffs/Pl:Import/Guidelines

From OpenStreetMap Wiki

The import guidelines, along with the Automated Edits code of conduct, should be followed when importing data from external sources into the OpenStreetMap database, as they embody many lessons learned throughout the history of OpenStreetMap. Imports should be planned and executed with more care and sensitivity than other edits, because poor imports can have significant impacts on both existing data and the local mapping community. The Data Working Group is tasked by the OSMF to detect and stop imports that do not comply with the guidelines. Not following these guidelines may therefore put your account at risk of being blocked.

Imports should not be seen as an alternative to building the mapping community, running mapping parties and generating publicity to engage with more contributors. Of course, all of this is open to discussion, such as on the imports mailing list and this discussion page.

Process

If you think your city/county/state/country government, a non-profit, or some other organization or person has great data that could be used to improve the quality of OpenStreetMap, here's what you need to know, starting with a quick overview of the process. Most of these areas are expanded in further sections of this page and on related pages.

Step 1 - Preparation:

  1. Gain familiarity with the basics of OpenStreetMap, including editing, such as adding details of your neighbourhood.
  2. Review what can go wrong with imports.
  3. Identify data you'd like to import. This might be street centerlines, building outlines, waterways, addresses, etc.
  4. Be aware of OpenStreetMap licensing.

Step 2 - Contacting the community:

  1. Before any actual work is performed on the import, it is recommended that you contact the community to see whether there is interest in importing the data. Different geographic areas in OSM have different acceptance levels for imports. The exact same kind of data set might be welcomed in one area and rejected in another.
  2. Discuss your plan. Email the OSM community to notify them of your plans, including a link to your wiki page. You can do this with an email to (at a minimum) imports@openstreetmap.org, talk-(your country)@openstreetmap.org, and the OSM group specific to the area directly impacted by the import (note that you must join the lists at https://lists.openstreetmap.org/ before mailing to them). This will help you gain the benefit of past experiences, which may include someone having already reviewed the data you're considering for import. Check for local user groups, local chapters, and country-specific mailing lists. You may use osm-community-index to avoid missing active communities that use a forum, Telegram, Slack or other methods for discussion.
  3. Be prepared to answer questions from the community. Discuss with the community the suitability of each layer for importing. Some data can be readily imported without much difficulty, while others are far more difficult (e.g. street centerlines). Also some are broadly accepted for import, while others haven't had much consensus (e.g. parcel boundaries).
  4. More complex and large-scale imports should be reviewed with the assistance of more technically-oriented and experienced OSM volunteers.
  5. You must not import the data without local buy-in.

Step 3 - License approval:

  1. You must obtain proper permissions and licenses to use the data in OSM from the data owner. If the license of the data is not compatible with the OpenStreetMap Open Database License, you cannot use the data. Many localities already have progressive open data policies. Others have data policies that are almost open, but conflict with OSM's needs through prohibitions on commercial use or requirements for attribution. Sometimes, getting permission to use data, even if the existing license might seem prohibitive, is as simple as asking the appropriate authority if they are willing to comply with the terms of the OpenStreetMap Open Database License. See Import/GettingPermission for example emails that touch on important issues. See also ODbL Compatibility for a quick view of some compatible and incompatible licenses of data to import.

Step 4 - Documentation:

  1. You must register your permissions and project by adding a line to the table at Import/Catalogue.
  2. You must write a plan for your import in the OSM wiki. Create a wiki page outlining the details of your plan. This plan must include information such as how you will convert the data to OSM XML, how you plan to divide up the work, how to handle conflation, how to map GIS attributes to OSM tags, how to potentially simplify any data, revert plans, changeset size policies, and plans for quality assurance. An example can be found at Import/Plan Outline.
  3. If required by the data owners, you must add an acknowledgement to the list of Contributors.

Step 5 - Import review:

  1. You must subscribe to the imports mailing list and post a review of your import to imports@openstreetmap.org. Don't upload any data until the project has been reviewed first. Note that the imports-us@openstreetmap.org list (or any other local imports list) is not considered authoritative, and is not a substitute for approval from imports@openstreetmap.org.
  2. If possible, prepare the data and make it available for review.

Step 6 - Uploading the data:

  1. Follow your plan.
  2. Track your progress.
  3. Provide updates to the community on your efforts.
  4. Let everyone know when you're done.
  5. You must use a dedicated user account.

Key issues

Discuss your proposed import with the community

It is important to discuss your proposed import with the community at every step. First of all add an entry to Potential Datasources. Here you can briefly describe what you have found out about the licensing of the data, and the data's accuracy with respect to data we already have. If you need more space, link through to a new wiki page about the data source.

Discuss your import on the imports@openstreetmap.org mailing list and with appropriate local communities. Many local communities have their own wiki pages and/or mailing lists. Coordinate with other people with similar plans.

Even if the same or similar import has been discussed before, you should still discuss it with the local community. This means that they are aware of your plans and can raise any issues or clashes before any damage occurs. This is especially true if the data has been available for a long time and has not yet been imported - this does not mean it is acceptable to proceed without discussion with the local community.

Always start by discussing the investigation you have done into licensing and accuracy. If the consensus is that the data doesn't meet our criteria, don't be disappointed. Label it as rejected on the Potential Datasources page, and give the reasons. Documenting such decisions is a helpful contribution in itself. If people are happy with it, move on to discussing implementation of import scripts etc.

Imports related to humanitarian issues, disaster response, or development should consult the HOT (ideally on the HOT Mailing list) as an additional measure.

Document your import

If you are going ahead with your import, please create a page about it on the wiki, with all the details. Create an entry on the Import/Catalogue page and link from there to your page. Also link to your page from any local pages about your city or country. The page should have the following details:

  • Datasource accuracy and licensing (also summarised on Potential Datasources)
  • Import/Software you plan to use. Share the source code you are using.
  • Exactly how data will be translated from another format into OSM format
  • How the resulting data will look. Exact tags being used.
  • Link to sample data imported on the test database.
  • User name of the account performing the import, and other details of how the changesets will be tagged

And as the import progresses

  • Link to example data imported on the live database.

Note that changing the scope of the import after the consultation will invalidate the consultation. You cannot have people agree to an import of a data set X and then import X+Y.

Licensing issues

We are only interested in 'free' data. We must be able to release the data with our OpenStreetMap License. Obviously we are allowed to use public domain data sources, of which there are quite a few, but beyond that, it gets more complicated.

OpenStreetMap moved to the Open Database License in September 2012. Your data must therefore be compatible with that. In addition, you must be able to agree to the Contributor Terms for your import account, which includes provisions to relicense under another free and open license if the community wishes it.

You must not claim an additional copyright for yourself as the importer. For example, if you import public domain data, you must not seek to restrict the use of your imported data. Your import account must not refuse any permissions that were given by the original creators of the data you're importing.

Please also note the details of the attribution requirement. We can offer some attribution: we can credit the data donors on our website (not on the homepage, but on the Contributors page here on the wiki, and on www.openstreetmap.org/copyright for very large-scale contributions). We can link information about them to the user account performing the import edits, meaning the editing history will allow people to trace the source of the data donation. We can also set their name in the 'source' tags of our underlying data. This is perhaps more prominent, but may be removed by editors doing further mapping work. The credit to the "author" stops there. What we certainly cannot do is require end-users of our data/renderings to give credit to the particular data donor. With this in mind, our attribution may not be sufficient legally speaking and might actually be considered unsatisfactory by the original "authors" of the data.

We often find that data that purports to be available under a compatible license has ultimately been derived from sources that we consider to be non-free. For example, although some geodata is available from Wikipedia under a Creative Commons license, it is a widely held belief in OSM that some of the data is simply derived from Google Maps, and therefore not actually available under the stated license. In such cases it is an established community norm not to import data whose provenance is uncertain, regardless of the stated license. Better safe than sorry.

Use a dedicated user account

Create a new user for the import. You must not use your standard OSM user account. The user page for the account should be used to collate data relating to the source and contact details for the import. Furthermore, it means that attribution can often be carried in the account's display name, or in the account's user page, which is better than putting it as a tag, as the user's editing history is a permanent record of the source and doesn't interfere with tags or increase the size of the database as much. For these reasons, creating a dedicated user account is preferable to using a source=* tag. For distributed/community imports, have each person make their own import account, for example "your osm user name"_import. It is not required that each import be done under the same user account.

A great tip from user:Aaron Lidman: if you have an existing Gmail account, you can use an email alias to sign up for a new OSM account for imports by using youremail+something@gmail.com, for example johnsmith+anytownimport@gmail.com. Gmail will route the OSM emails to your normal email address. This allows you to create multiple OSM import accounts for a single email address.
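The alias pattern from the tip above can be sketched in a few lines of Python. The function name `import_alias` and the restriction to letters and digits in the label are assumptions made for this illustration, not rules of Gmail or OSM.

```python
def import_alias(address: str, label: str) -> str:
    """Build a plus-addressed alias such as johnsmith+anytownimport@gmail.com.

    `import_alias` is a hypothetical helper for this example only.
    """
    local, sep, domain = address.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Assumption: keep the label simple (letters and digits only).
    if not label.isalnum():
        raise ValueError("label should contain only letters and digits")
    return f"{local}+{label}@{domain}"


print(import_alias("johnsmith@gmail.com", "anytownimport"))
# prints: johnsmith+anytownimport@gmail.com
```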

Not complying with this rule is one of the reasons that could lead to your account being temporarily blocked by the DWG.

Use the right tags

Your import should use tags which are familiar to the OSM community, rather than inventing its own set of tags.

You may have some metadata like the IDs used for your original data. If this metadata will be useful to OSM, then define your own prefix and use that on those metadata tags. The TIGER import for instance uses the "tiger:" prefix. The original ID of a TIGER object is tagged as "tiger:tlid".

However, don't go overboard with metadata. OSM is only interested in what is verifiable. This doesn't include (for example) foreign keys from another database, unless those are absolutely necessary for maintaining the data in future. Your data source may have a great many fields, but OSM data elements with a great many tags can be difficult to work with. Strike a balance. Figure out (discuss!) which fields the OSM community is interested in.
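As a minimal sketch of the two points above (familiar tags plus a small amount of prefixed metadata), the Python below maps a source record to OSM tags. The source field names `FULLNAME`, `TLID`, `INTERNAL_FK` and `NOTES`, and the helper `to_osm_tags`, are hypothetical and chosen only for illustration; the only names taken from this page are the "tiger:" prefix and the resulting `tiger:tlid` key. The actual mapping should be agreed with the community.

```python
def to_osm_tags(record, tag_map, keep_metadata, prefix):
    """Translate one source record into a dict of OSM tags.

    record        -- source attributes, e.g. one row of a shapefile table
    tag_map       -- source field -> established OSM key
    keep_metadata -- source fields worth keeping under the import prefix
    prefix        -- import-specific prefix, e.g. "tiger"
    Every other field is dropped on purpose.
    """
    tags = {}
    for field, value in record.items():
        if value in (None, ""):
            continue  # never create empty tags
        if field in tag_map:
            tags[tag_map[field]] = str(value)
        elif field in keep_metadata:
            tags[f"{prefix}:{field.lower()}"] = str(value)
    return tags


# Hypothetical source record; INTERNAL_FK stands for a foreign key
# that is of no use to OSM and is therefore not carried over.
record = {"FULLNAME": "Main Street", "TLID": 123456, "INTERNAL_FK": 42, "NOTES": ""}
tags = to_osm_tags(
    record,
    tag_map={"FULLNAME": "name"},
    keep_metadata={"TLID"},
    prefix="tiger",
)
print(tags)
# prints: {'name': 'Main Street', 'tiger:tlid': '123456'}
```

Keeping the mapping in two explicit collections makes it easy to publish on the import's wiki page, so reviewers can see exactly which fields are kept and which are discarded.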

Don't put data on top of data

Unlike traditional GIS systems, OpenStreetMap has no concept of layers. Data on top of data is just a mess. It's a kind of mess which makes it very difficult for real users to work in the normal OpenStreetMap editors. The Duplicate nodes map reveals imports that have not followed this guideline. (Rogue TIGER importers caused a lot of this, but sadly there are many more recent messed-up imports.)
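One symptom of data on top of data is several nodes at the same position. A minimal sketch of a pre-upload check is shown below, assuming the prepared data has already been loaded into a plain dict of node id to (lat, lon); the function name `find_duplicate_nodes` and this input shape are assumptions of the example, not part of any OSM tool.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, precision=7):
    """Group node ids that share the same rounded position.

    nodes     -- dict of node id -> (lat, lon) in degrees
    precision -- decimal places used for comparison (7 is an assumption
                 matching the coordinate precision commonly used in OSM)
    Returns a list of id groups, each containing two or more ids.
    """
    by_position = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


nodes = {
    1: (52.2297, 21.0122),
    2: (52.2297, 21.0122),  # sits exactly on top of node 1
    3: (52.2300, 21.0130),
}
print(find_duplicate_nodes(nodes))
# prints: [[1, 2]]
```

Running a check like this on the combined set of existing and to-be-imported nodes gives a list to resolve by hand before anything is uploaded.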

If your data is in a layered traditional GIS format, you'll need to take a different approach. Perhaps merge the layers and calculate the best aggregate tags. Alternatively, you can avoid importing the data directly and instead set up a source for users to manually and selectively import from, or a WMS to trace over (like Natural Resources Canada - Toporama).

If you are importing data where there is already some data in OSM, then you need to combine this data in an appropriate way or suppress the import of features that overlap with existing data. Only where you have explicit support, which is highly unusual, should you replace current data with your imported data.
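Suppressing overlapping features can be sketched as a simple pre-filter. The example below is a deliberately coarse illustration, assuming each feature is a list of (lon, lat) pairs and using bounding boxes only; the helper names are hypothetical, and a real import would normally rely on proper geometry tests and manual review of the conflicts.

```python
def bbox(coords):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of (lon, lat)."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    return (min(lons), min(lats), max(lons), max(lats))


def boxes_overlap(a, b):
    """True if two bounding boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def split_by_overlap(candidates, existing):
    """Separate candidate features into (clean, conflicts).

    clean     -- candidates whose box touches no existing feature
    conflicts -- candidates to merge by hand or leave out of the import
    """
    existing_boxes = [bbox(feature) for feature in existing]
    clean, conflicts = [], []
    for feature in candidates:
        box = bbox(feature)
        if any(boxes_overlap(box, other) for other in existing_boxes):
            conflicts.append(feature)
        else:
            clean.append(feature)
    return clean, conflicts


existing = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]]
candidates = [
    [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5)],  # overlaps the existing feature
    [(5.0, 5.0), (6.0, 5.0), (6.0, 6.0)],  # far away from existing data
]
clean, conflicts = split_by_overlap(candidates, existing)
print(len(clean), len(conflicts))
# prints: 1 1
```

A bounding-box test errs on the side of flagging too much, which fits the guideline: a feature is only uploaded automatically when nothing already mapped is near it.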

Simplify the data

Shapefiles often include too many details, i.e. more nodes than necessary to represent curves, or more than two nodes representing a perfectly straight line. You'll see this particularly with large landuse areas that have nodes a few meters apart or appear jagged because the resolution isn't fine enough or is too fine. Tools such as Map Shaper can be used to simplify shapefiles that have too many details. Remember to think about how the data looks and how it can be worked on in standard community editors (e.g. iD on the website).
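To illustrate what simplification does, here is a minimal sketch of the classic Douglas-Peucker algorithm in Python. It is a stand-in chosen for the example and is not a description of how Map Shaper works internally. It assumes planar coordinates (for instance metres in a projected coordinate system), and the tolerance value is arbitrary.

```python
import math


def _distance_to_segment(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop points that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (5, 0.01), (10, 0), (10, 5), (10, 10)]
print(simplify(line, tolerance=0.1))
# prints: [(0, 0), (10, 0), (10, 10)]
```

The nearly collinear point and the redundant point on the straight edge are removed, while the corner is kept. Very long ways may need an iterative version to avoid deep recursion.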

Keep server resources in mind

Make sure you don't overload the server when importing large amounts of data. The TIGER import had to be spread out over several months to not kill the central server! Import the data in small installments or otherwise slow down your import scripts. If in doubt, talk to the System Administrators.
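Importing in small installments can be sketched as a loop that sends one batch at a time and pauses in between. In the Python below, `upload` is a hypothetical callback standing in for whatever import script or tool you actually use, and the batch size and pause are illustrative placeholders rather than recommended values; agree real limits with the community and, if in doubt, the System Administrators.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def upload_in_installments(items, upload, batch_size=500,
                           pause_seconds=30.0, sleep=time.sleep):
    """Send `items` batch by batch, pausing between batches.

    upload -- hypothetical callable that uploads one batch
    sleep  -- injectable so the pacing can be tested without waiting
    Returns the number of batches sent.
    """
    batches = 0
    for batch in chunked(items, batch_size):
        if batches:
            sleep(pause_seconds)  # give the server room between batches
        upload(batch)
        batches += 1
    return batches


# Demonstration with a fake uploader that only records what it was given.
sent = []
count = upload_in_installments(list(range(1200)), sent.append,
                               batch_size=500, pause_seconds=0.0)
print(count, [len(batch) for batch in sent])
# prints: 3 [500, 500, 200]
```

Small batches also make a failed or interrupted upload easier to identify and revert.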

Take great care to avoid damaging the database

Take great care to avoid damaging the database. Don't leave a messy import and assume that nameless OpenStreetMap contributors working in iD and Potlatch will tirelessly complete your work. JOSM is better at untangling messy data, but it's still difficult, and you should do this work yourself if necessary.

If your import does 'go wrong', or you need to interrupt an upload halfway through, it should be reverted promptly. If you need help, contact Imports and/or Talk. If you don't know how to revert an import, don't do the import in the first place.

Additional resources