Talk:AI-Assisted Road Tracing
- 1 FAQs
- 1.1 What does AI mean?
- 1.2 Are there limitations with AI?
- 1.3 Are you adding roads directly to OSM without humans validating?
- 1.4 Why are you using iD Editor versus JOSM?
- 1.5 What Imagery are you using?
- 1.6 Can we share DG imagery?
- 1.7 Can you share DG imagery?
- 1.8 Facebook License with Digital Globe
- 1.9 Is the Imagery free of Cloud Cover?
- 1.10 Who is behind firstname.lastname@example.org?
- 1.11 Can the Changeset be more specific?
- 1.12 Can you Publish Source Code for ML?
- 1.13 Can you Publish Source Code for internal tools?
- 1.14 Can Facebook Share other data?
- 1.15 Why did you choose Thailand?
- 1.16 Are you going to Map other countries?
- 1.17 What is the Overall Process?
- 2 Discussion
What does AI mean?
- Artificial Intelligence (AI) is training a computer to do things that require intelligence when done by humans. In this case it involves creating training data for an area by tracing roads on satellite imagery, the same way you would when editing, and feeding that data to a computer. The output is a set of road masks. Here is an example prediction result using deep learning for an area in Thailand.
Are there limitations with AI?
- Yes. AI and algorithms still have a long way to go before they can handle things such as assigning tags according to region-specific OSM guidelines, choosing the correct OSM relation, adding bridges over water bodies, or working out connections with crossing railways, among many other things. Currently all of this is done manually by our mappers.
Are you adding roads directly to OSM without humans validating?
- Every single road from the AI output will be added only after multiple rounds of human validation. After some processing of the road masks shown above, we end up with a .osm file. We then treat this like remote tracing. Using iD editor and/or JOSM, mappers make sure that merging with the current data has happened smoothly. They tag each road appropriately, connect it to current OSM edits, fix alignment when necessary, add sensible changeset comments/notes, and resolve conflicts before finally saving, at which point the road is merged with the current OSM data.
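To make the ".osm file" step concrete, here is a minimal sketch of writing traced roads out as OSM XML. It is not the team's tool: the function name `roads_to_osm_xml`, the `generator` value, and the placeholder `highway=road` tag are assumptions for illustration. Negative ids follow the JOSM file convention for objects that have not been uploaded yet.

```python
import xml.etree.ElementTree as ET


def roads_to_osm_xml(roads):
    """Serialize roads to an OSM XML string.

    `roads` is a list of roads; each road is a list of (lat, lon) tuples.
    Hypothetical helper, not part of any published tool.
    """
    root = ET.Element("osm", version="0.6", generator="road-sketch")
    next_id = -1  # negative ids mark objects that do not exist on the server yet
    ways = []

    # Nodes come first in the file, so emit every node before any way.
    for coords in roads:
        refs = []
        for lat, lon in coords:
            ET.SubElement(root, "node", id=str(next_id),
                          lat=f"{lat:.7f}", lon=f"{lon:.7f}")
            refs.append(next_id)
            next_id -= 1
        ways.append(refs)

    for refs in ways:
        way = ET.SubElement(root, "way", id=str(next_id))
        next_id -= 1
        for ref in refs:
            ET.SubElement(way, "nd", ref=str(ref))
        # Assumed placeholder class; a mapper would set the real road type.
        ET.SubElement(way, "tag", k="highway", v="road")

    return ET.tostring(root, encoding="unicode")
```

Calling `roads_to_osm_xml([[(13.75, 100.5), (13.751, 100.501)]])` yields one way with two nodes, which iD or JOSM can open as a local layer for the human validation described above.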
Why are you using iD Editor versus JOSM?
- We know JOSM is efficient at what we are trying to do. In fact, we started primarily with JOSM. That said, to make our internal process more efficient, our engineers improved the iD editor so that our version has functionality comparable to JOSM. We are working on making this public so we can share it with the community.
What Imagery are you using?
- We are using DigitalGlobe's +Vivid, which is high-resolution (50cm / pixel, or zoom 18), color-corrected and cloud-free.
Can we share DG imagery?
- Our license does not currently permit this. However, DG, as noted in this forum, is looking at options to publish imagery for OSM editing similar to the imagery we are using.
Can you share DG imagery?
- Our license does not currently permit this. However, to reiterate what Kevin from Digital Globe said earlier on this thread, this will not be an issue in a few weeks, when all of OSM will get an imagery refresh.
Facebook License with Digital Globe
- This license is necessary to comply with our obligations to Digital Globe, and we think the benefits of supplying this data outweigh the licensing restrictions. The data will be under the ODbL once it has been contributed to OSM in compliance with the terms.
- There is no contradiction between the OSM terms and the Facebook license with Digital Globe. Our license terms do not apply to data once contributed to OSM and termination of the agreement with Digital Globe would not impact the data that has already been contributed to OSM.
Is the Imagery free of Cloud Cover?
- The intended output of +Vivid is cloud free, and in most cases, DG meets this. There are some highly cloudy parts of the world where it’s unavoidable. On average though, this product maintains less than 1% cloud cover.
Who is behind email@example.com?
- There are 4 people behind this email. This helps us collaborate as a team and is the most efficient way for everyone to be on the same page. Having more people also ensures that no emails are missed. I (Drishtie Patel), as the Program Manager, primarily watch this email and respond to queries. The other 3 people are our engineers, Ming Gao and Saikat Basu, and the Product Manager for Maps at Facebook, Sadi Khan.
Can the Changeset be more specific?
- Yes, we can absolutely change that to import=facebook-ai-____. We don't specifically have versions, but we can include the area or task number, for example. We are open to suggestions.
- As for adding more details to our edits: there were no changes to current OSM data in this sample and we plan to add notes when we upload as would normally happen when using the Tasking Manager and iD. The sample shared has not been uploaded yet.
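As an illustration of what more specific changeset tagging could look like, the sketch below builds a changeset body in the XML shape the OSM API v0.6 accepts for changeset creation. The function name `changeset_payload` is hypothetical, and the `import` value is deliberately left to the caller, since the final value has not been settled above.

```python
import xml.etree.ElementTree as ET


def changeset_payload(comment, source, import_value):
    """Build the XML body for creating a changeset.

    All three values are supplied by the caller; nothing here reflects
    the tags the team finally chose (hypothetical helper).
    """
    if not comment.strip():
        # A changeset without a meaningful comment is hard to review.
        raise ValueError("changeset comment must not be empty")

    root = ET.Element("osm")
    changeset = ET.SubElement(root, "changeset")
    for key, value in (("comment", comment),
                       ("source", source),
                       ("import", import_value)):
        ET.SubElement(changeset, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")
```

For example, `changeset_payload("Add AI-assisted roads for one task area", "digitalglobe", "yes")` attaches the tags to the changeset rather than to each road, so the area or task number can go in the comment.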
Can you Publish Source Code for ML?
- We cannot share the source code for Machine Learning at this time, but we can and will share our internal tools.
Can you Publish Source Code for internal tools?
- Yes. We are currently using a version of iD Editor and the HOT Tasking Manager. We are absolutely going to share our tooling. We are currently setting up a GitHub repository where we will post it soon.
Can Facebook Share other data?
- This is a clear violation of privacy, so we cannot share this kind of information. We are exploring ways to crowdsource road names for OSM and will share that once we have a more concrete plan.
Why did you choose Thailand?
- Facebook has a high number of users in this country and we would like to improve the map for this community. It is also our mission as a company to make the world more open and connected, and one way we can do this is by filling in the gaps on the map. We also saw a strong OSM community that we could learn from and engage with while we refine our process for mapping. We are hoping for community feedback as we move forward so we can contribute high quality edits.
Are you going to Map other countries?
- Yes. We are moving slowly to focus efforts on one country at a time to make sure our process is accepted by the OSM community out of respect for the process.
What is the Overall Process?
- PHASE 1 - Generating Road Masks
- Training data is created by editors. Example
- Training data is used by engineers in ML to produce road masks. Example
- Road masks are processed to remove low confidence predictions and add connections between short breaks.
- PHASE 2 - Creating Road Vectors (.osm files)
- 1. Road masks are then converted by an algorithm from a black and white image to a vector format, specifically a .osm file. Example
- 2. Our .osm file is then merged with current OSM changesets. During this process the following things take place
- All current OSM data for an area is merged, keeping full history.
- No changes are made to the current OSM data.
- Duplicated roads are deleted from the ML-generated data. Current OSM data always takes precedence over Facebook-generated data.
- New ML identified roads are connected to current OSM roads.
- 3. We then have a locally stored file containing both our generated roads and the current OSM data set for a specific area.
- PHASE 3 - Human Validation
- 1. We use the Tasking Manager to divide up tasks and create a flow of editing and validation for the Facebook mapping team. Example
- 2. The Mapper picks a task and loads the locally stored .osm file generated from the post processing steps outlined above. Example
- 3. The task is opened up in our enhanced version of iD. (This includes both the roads generated by us and the current OSM data). Example
- 4. Our generated roads are highlighted in a different color so editors can inspect them for issues like crossing highways, disconnected roads, incorrect intersections, short road stubs, road types, etc., and fix these issues manually one by one, using Digital Globe satellite imagery as the background.
- 5. We do not delete current OSM data, but make typical OSM editing changes where necessary to ensure high quality of the data (more precisely follow imagery, consistent tagging, etc).
- 6. Our modified iD tool is equipped with data validation functionalities similar to JOSM and osmlint. This allows us to check for quality and conflicts with current OSM data. Some examples of what we check for include:
- 7. In case of conflicts between our newly created roads and another OSM editor's mapping, our mapper will almost always choose “keep their edits”. If we think our newly added roads are better aligned with the latest version of DG satellite imagery, we'll contact the other mapper offline to reach a resolution. If we end up changing another mapper's edits, we will leave detailed notes to explain why.
- 8. Until all errors are fixed mappers will not be able to save. Here is an example of a highlighted issue prompting the mapper to fix it.
- 9. After fixing all the issues detected, our mapper clicks the “Save Local” button in the iD tool to save their edited roads locally for validation.
- 10. A second person, the validator, then goes into the same task to verify that the data looks correct, makes changes as needed, and clicks “Submit” to finally upload the tile to OSM.
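The Phase 1 post-processing ("remove low confidence predictions and add connections between short breaks") can be pictured with a small sketch. The actual ML pipeline is not published, so this is only a stand-in under stated assumptions: the mask is a 2D grid of per-pixel confidences, the 0.5 threshold and 2-pixel gap limit are made-up defaults, and gaps are bridged along rows and columns only (a real pipeline would more likely use morphological operations from an image library).

```python
def threshold_mask(confidence, min_confidence=0.5):
    """Keep only pixels whose road confidence reaches the threshold.

    `min_confidence` is an assumed value, not one taken from the source.
    """
    return [[1 if p >= min_confidence else 0 for p in row]
            for row in confidence]


def bridge_gaps_1d(cells, max_gap):
    """Fill runs of at most `max_gap` empty cells between two road cells."""
    out = list(cells)
    last_road = None
    for i, value in enumerate(cells):
        if value:
            gap = i - last_road - 1 if last_road is not None else 0
            if 0 < gap <= max_gap:
                for j in range(last_road + 1, i):
                    out[j] = 1
            last_road = i
    return out


def bridge_gaps(mask, max_gap=2):
    """Bridge short breaks horizontally and vertically, then combine."""
    by_row = [bridge_gaps_1d(row, max_gap) for row in mask]
    by_col = [bridge_gaps_1d(col, max_gap) for col in zip(*mask)]
    by_col_as_rows = [list(row) for row in zip(*by_col)]  # transpose back
    return [[a | b for a, b in zip(r1, r2)]
            for r1, r2 in zip(by_row, by_col_as_rows)]
```

A one-pixel break in a road is closed, while a three-pixel break is left for a human mapper to judge, which matches the intent that short breaks are connected automatically and everything else goes through validation.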
Why we changed our title for this wiki
Changed from “AI-Based Country Scale Road Import” to “AI-Assisted Road Tracing”
Since posting this wiki we have received quite a bit of feedback from the forums, e-mail and conversations with long time OSM contributors. Based on this feedback we have decided to go back and make some changes to help clarify a few things.
- While using AI has become common practice in the tech world, it is fairly new to OSM and there aren't any guides to adhere to so we decided to call it an import. However we wanted to clarify that we have created a process that is basically like remote tracing using satellite imagery, where most of our roads are generated using deep learning.
- It is important to note we are creating the data ourselves, processing it and then following the same process you normally would when using a Tasking Manager to edit with iD or JOSM.
- Multiple people go through each node and way making sure to tag each road appropriately, connect them to current OSM edits, fix alignment when necessary, add sensible changeset comments/notes and resolve conflicts before finally saving where it is merged with the current OSM data.
- We are working with just one country at a time and our edits will happen slowly by grouping areas into small regions so we complete one before moving to the next. We plan to spend the next few months in Thailand.
- Here is an example of how we plan to divide the tasks for the country. The colors indicate the current density of roads going from blue to red for high density areas. Thailand Road Density by Task
- As promised, we will also be sharing sample data with the import list.
Who is "We"
- usernames had been hidden inside links. Expanded them so search will return them. --Stephankn (talk) 06:48, 21 March 2017 (UTC)
Discussion with the community
It's worth mentioning that there is an active OSM forum for Thailand at https://forum.openstreetmap.org/viewforum.php?id=46 , and at the very least I'd expect any import of "detected" roads in that area to be discussed with them before it takes place. Personally I'm somewhat sceptical that "we have been able to train models accurate enough to detect roads from satellite imagery", since the evidence so far suggests that you absolutely have not been able to do that (see https://forum.openstreetmap.org/viewtopic.php?id=55685 et al). --SomeoneElse (talk) 00:01, 17 February 2017 (UTC)
The consultation with the community should probably not happen with the local community first and imports@ second, but both at the same time, or perhaps iron out the basic issues on imports@ before proposing something to the local community that may not happen. Pnorman (talk) 11:27, 18 February 2017 (UTC)
We really value feedback, so thank you for highlighting some points we needed to clarify. Please take a look at our edits to our original post. To clarify further, we are not directly uploading machine generated roads; it's AI-assisted human mapping and we have conducted extensive training with our mappers to make sure each edit is validated by multiple different people to ensure quality.
We've been communicating via email over the past 4 months with specific local users in Thailand to gather feedback and local knowledge and to share our initial results. There are a number of people behind this email firstname.lastname@example.org, and we invite all feedback and questions. —Preceding unsigned comment added by DrishT (talk • contribs) 17:21, 17 February 2017
- Yes, but the last one was supposed to be AI-assisted mapping. It's not clear to me that you've established where the last one went wrong and what measures there are to ensure it won't happen again. p.s., it helps if you sign your posts with four ~ so people can tell who's written what. Pnorman (talk) 11:13, 18 February 2017 (UTC)
Since our last upload we have improved our process based on community feedback. We're now working with a larger team of highly-trained mappers and have added steps to the validation process. We're confident this round will be much smoother! Again, we welcome your suggestions at email@example.com. We have a dedicated team behind the email who will respond within 24 hours. --DrishT (talk) 17:38, 18 February 2017 (UTC)
- Thank you, Paul. We provided these comments so community members could easily see our edits. Each mapper will provide more detailed comments as they edit following this guide: Good Changeset Comments--DrishT (talk) 17:36, 18 February 2017 (UTC)
Lack of discussion
This page was created on 14 February and claimed that
- Following the imports guidelines, these imports will be discussed first in the country specific mailing list, and to get more feedback it will also be shared with the import list.
Yet, 2 days later and without any discussion on the imports list OR the Thailand user forum, someone proceeded to import data in Thailand: https://www.openstreetmap.org/user/VLD004/history I'm very sorry to see that this continues the "bad faith" track record from last time with its lack of transparency (how hard can it be to put a note in the user profile explaining who they are and who they are working for) and a blatant disregard for the community (you promise to discuss things ahead and then break your own promise two days later). Is what we're seeing here just incompetence, or a deliberate attempt at misleading the community? You must have known, after things failed so badly last time, that more diligence is required in dealing with OSM. --Frederik Ramm (talk) 12:17, 24 February 2017 (UTC)
- Hi Frederick, Prior to importing data, we shared proposed edits with Thailand OSM community members via email for feedback. Given your expertise and experience in OSM, we'd like to speak to you, and your OSM Data Working Group partners, in more detail about our approach and process. We're new to this community and want your direct feedback and suggestions. Can you please share some days and times over the next couple of weeks that work for your group? We'll set up a meeting via VC if we're unable to all be in the same place. Thanks again, Drish --DrishT (talk) 01:27, 25 February 2017 (UTC)
- You can and should email the imports mailing list? That's a good place to speak to relevant people. If you're new to OSM, perhaps you shouldn't start with an import. Maybe start with regular OSM contribution, and when you have learned the ropes, you can do an import, which is a more advanced action Rorym (talk) 14:55, 25 February 2017 (UTC)
- Hi Andy, thank you for the feedback. Yes, we are aware of the need to post to the import list and will do so before making any edits.
import=yes and source=digitalglobe
On Oct. 17 you "clarified" that you would not tag changesets but individual elements with source=digitalglobe and import=yes . This differs from the previous statements where you said you would tag changesets this way.
I think you should tag changesets with "import=yes". I would advise against using import=yes on elements, because it is unclear how future mapping should deal with the tag (Should I remove it when I adjust your element from local knowledge or better imagery? Should I leave it alone?). --Gormo (talk) 13:06, 18 October 2017 (UTC)
- (Crosspost reference) This was also mentioned by Nakaner on the imports mailing list: https://lists.openstreetmap.org/pipermail/imports/2017-October/005188.html . --Gormo (talk) 13:09, 18 October 2017 (UTC)
- I believe it would be desirable to have the information, that these are not human traced things but coming from an AI, somewhere in the changeset tags. The source is not just "digitalglobe", digitalglobe only provided the imagery. Reason is, that usually an aerial imagery reference in the source tag implies: "traced by the user". --Dieterdreist (talk) 15:08, 18 October 2017 (UTC)