Open Data License/Alteration Files - Guideline
These are community guidelines, so please put your comments on the discussion page or inline in this page.
Background: What's the problem?
ODbL clause 4.6b states (in part):
If You Publicly Use a Derivative Database or a Produced Work from a Derivative Database, You must also offer to recipients of the Derivative Database or Produced Work a copy in a machine readable form of:
- a. The entire Derivative Database; or
- b. A file containing all of the alterations made to the Database or the method of making the alterations to the Database (such as an algorithm), including any additional Contents, that make up all the differences between the Database and the Derivative Database.
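As an illustration only (it is not part of the license text or of this guideline), the following Python sketch shows one way a "file containing all of the alterations" under b. could be produced and checked. It assumes a toy model in which a database is a mapping from an element identifier to its tags. The function names `make_alteration_file` and `apply_alteration_file`, the JSON layout and the sample identifiers are all hypothetical.

```python
import json


def make_alteration_file(original, derivative):
    """Collect the differences that turn `original` into `derivative`.

    Both arguments are dicts mapping an element id to a dict of tags
    (a toy model, assumed for this sketch only).
    """
    return {
        # Elements present only in the derivative database.
        "added": {k: derivative[k] for k in derivative.keys() - original.keys()},
        # Ids of elements dropped from the original database.
        "removed": sorted(original.keys() - derivative.keys()),
        # Elements present in both whose tags differ (new tags are stored).
        "modified": {
            k: derivative[k]
            for k in original.keys() & derivative.keys()
            if original[k] != derivative[k]
        },
    }


def apply_alteration_file(original, alterations):
    """Rebuild the derivative database from the original plus the alterations."""
    removed = set(alterations["removed"])
    result = {k: v for k, v in original.items() if k not in removed}
    result.update(alterations["modified"])
    result.update(alterations["added"])
    return result


# Hypothetical sample data.
original = {
    "node/1": {"amenity": "cafe"},
    "node/2": {"highway": "bus_stop"},
    "node/3": {"shop": "bakery"},
}
derivative = {
    "node/1": {"amenity": "cafe", "opening_hours": "Mo-Fr 08:00-18:00"},
    "node/3": {"shop": "bakery"},
    "ext/7": {"amenity": "library"},
}

alterations = make_alteration_file(original, derivative)

# A plain, machine-readable serialisation of the alteration file.
print(json.dumps(alterations, indent=2, sort_keys=True))

# The recipient can check that original + alterations gives the derivative.
assert apply_alteration_file(original, alterations) == derivative
```

The round-trip check at the end is the point of the sketch: an alteration file is only useful to the recipient if applying it to the original Database actually yields the Derivative Database.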
What does this offering need to look like? Is it an active offering or an offering on request? We have also heard concerns that one could "encode" attributes so that the file has almost no value for the recipient.
This is at the proposal stage in our process; it may change after discussion by the OpenStreetMap community.
Open Issues, Use Cases and Discussion
Any text here is not part of the formal or proposed guideline!
There are a number of things left open by the license text that would be good to clarify:
- What form of offer is required?
- The license says in 4.6: "The Derivative Database (under a.) or alteration file (under b.) must be available at no more than a reasonable production cost for physical distributions and free of charge if distributed over the internet." This would mean that all internet distributions have to be free of charge, no matter what volume of data is involved. A likely side effect is that, for high-volume data sets, people will opt for physical distribution, since they can then at least charge reasonable costs, even though internet distribution would in most cases be technically preferred by both sides.
The problem of encoding is to some extent already addressed by 4.7 in the license. Beyond that, it would be reasonable to:
- have a term similar to the GPL's "preferred form of the work for making modifications", requiring any data to be made available in that form.
- require distribution in an open file format.
Both of these would, of course, apply equally to the distribution of the full derivative database.
The possibility of making available an algorithm is the vaguest part of the license. In particular, the following is unclear:
- Would it be fine to make available a binary blob of the algorithm for one specific computer platform, designed only to reproduce this particular derivative database from one particular version of the original database? Even if making the source code available is required, a malicious data user might offer only an inefficient version of the code, making reproduction of the derivative database very costly for anyone.
- What kind of documentation and readiness for use is required? A baseline for the level of support the recipient can expect, and the provider has to be prepared to give, would be good.
- What license terms may be imposed on use of the algorithm? Theoretically you could provide an algorithm but forbid any use of it (or impose some arbitrary conditions).
Use case: OSM + PD data
One probably quite common use case is merging OSM with public domain data from other sources. Even though the other data is freely available, share-alike applies if the combination of the two data sets is non-trivial. For high-volume data it might be preferable, possibly for both sides, to make the algorithm available instead of distributing large data sets.
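To make this use case concrete, here is a minimal Python sketch of what "making available the algorithm" might look like. It is only an illustration under stated assumptions, not a statement of what the license requires. The merge rule (OSM tags win on conflict), the matching of elements by a shared identifier, and the names `merge` and `fingerprint` are all hypothetical. A real conflation of OSM with another data set would usually match elements spatially and be far more involved.

```python
import hashlib
import json


def merge(osm, public_domain):
    """Hypothetical merge rule, assumed for this sketch only.

    Elements are matched by identical id. Public-domain tags are added to
    matching OSM elements without overwriting existing OSM tags, and
    unmatched public-domain elements are appended.
    """
    merged = {}
    for key, tags in osm.items():
        extra = public_domain.get(key, {})
        merged[key] = {**extra, **tags}  # OSM values win on conflict
    for key, tags in public_domain.items():
        if key not in osm:
            merged[key] = dict(tags)
    return merged


def fingerprint(database):
    """SHA-256 over a canonical JSON form, independent of insertion order."""
    canonical = json.dumps(database, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical sample inputs.
osm = {
    "way/10": {"highway": "residential", "name": "Mill Lane"},
    "way/11": {"highway": "track"},
}
public_domain = {
    "way/10": {"name": "MILL LN", "surface": "asphalt"},
    "pd/500": {"highway": "path"},
}

derivative = merge(osm, public_domain)
published_fingerprint = fingerprint(derivative)

# A recipient holding the same OSM version and the same public domain data
# reruns the published algorithm and compares fingerprints.
assert fingerprint(merge(osm, public_domain)) == published_fingerprint
```

In this sketch the provider would publish the source of `merge`, the exact input versions and the fingerprint, rather than the merged data set itself. Publishing readable source and a way to verify the result bears on the open issues above about binary blobs and readiness for use, but whether this would satisfy the license is exactly what remains to be clarified.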