Audio mapping

From OpenStreetMap Wiki

Audio mapping (also called voice mapping) is a mapping technique: a way of recording data while out surveying. It is an alternative to Photo mapping and other mapping techniques. Audio mapping techniques vary widely.

Why audio mapping

  • Quicker than pen & paper
  • Quicker than entering waypoints with meaningful names into your GPS device
  • Only metadata technique that can be done safely while driving a car or riding a bike (because you may not need your hands for recording)
  • Attracts less attention from the public than stopping to enter information on your phone at every waypoint
  • Gets details that cannot be seen on GPS logs
    • Street names
    • Street types
    • Access Restrictions
    • POIs
    • Linear features parallel to your track (e.g. rivers)

Possible problems with audio mapping

  • In some languages (such as English) there can be many possible spellings for one sound. You can spell names out, but while recording it is hard to notice the ambiguity and remember to do so.
  • It's easy to miss apostrophes etc.

How to record

If your area is already mapped, or if you have good satellite images, you might use audio mapping on its own, but in most cases you will use a GPS at the same time. If you use GPS + audio, you might use two devices for recording and a third one for editing at home. Of course, you can also use your laptop for all three tasks. There are some quite fancy dictaphone-type devices for sale with features such as voice activation. Alternatively, many mobile phones and portable music players have a voice-recording feature. To choose devices, see the Hardware Guide. Sometimes it might be useful to combine audio and photo mapping; see Dictacam.

When you are on the road, just say what you see. You might feel like a tour guide, and people passing you will look very confused :) You'll get used to this.

Remember to spell names that differ from the usual/common spelling when they have an equivalent pronunciation, e.g. "P-R-I-E-S-T-L-E-Y ROAD" is a better recording than "PRIESTLEY ROAD" which might get entered wrongly into OSM as "PRIESTLY ROAD" if the more common spelling were used when listening back to the recording (similarly, "MORE" and "MOOR", etc).

How to synchronize

Your audio data will be worthless if you cannot tell what recording belongs to which location. Be sure you understand this, or the next chapter will make no sense to you.

There are very different ways to ensure this. Depending on your choice the correlation between voice and position may be as good as 1 meter or as bad as 500 meters. And depending on your mapping style you might need or might not need this accuracy. As a beginner, you might want to try contextual synchronisation at first, advancing to the other techniques later.

Along with the synchronizing problem goes the question: do you make a new recording for every POI (maybe 6 seconds each), or do you record one big file (maybe several hours at once)? Some mappers make one big file but pause the recording whenever there is nothing to say.

NEW! Continuous sound tracks are now much easier to work with in JOSM. JOSM can now play and synchronize one long, continuous sound track with waypoints or frequently sampled points from a GPX track. So you can skip the long gaps in a continuous recording and end up at the right place. See JOSM Help for step-by-step instructions for how to use this in practice, and below.

Implicit synchronisation

If you use the same hardware and software for GPS and audio recording, they will be synced automatically. See the section Audio recording with GPS-aware devices.

Contextual synchronizing

This has been the most popular method until now. If you speak out every turn you make, you can follow where you are by looking at your track log. Then you can easily map street names and the like. If you lose context, you might not be able to resynchronize, especially if you are in a rather unknown and unmapped area. Contextual synchronizing only works when there are many turns, so try to avoid driving straight over a crossing. If you're on a long, straight road with POIs along it, this is the wrong technique for you. You can help yourself by driving a small circle and saying "there's a pub at the circle".

You can now automate this in JOSM. This is method 2 for audio mapping as described in JOSM Help, which gives step-by-step instructions.

Waypoint synchronizing

Many GPS devices have the ability to mark waypoints, which will be seen in JOSM. They may have meaningful names, but entering them is hardly possible while driving. Luckily, your device might assign an automatic name like "Waypoint 242". Then you just do a voice recording "Pub at Waypoint 242".

In JOSM you can apply an audio sound track to a set of waypoint markers, synchronize at the first waypoint of a long recording (say "NOW!" as you press the button), and then play starting at an arbitrary waypoint. This is method 1 for audio mapping as described in JOSM Help, which gives step-by-step instructions.

Time synchronizing

This method has been known for a long time from Photo mapping but is relatively new to audio mapping. It involves the timestamps of your audio recordings, looks up the GPS data to know where you were at this time, and displays the recording location on the map.

Time synchronizing is very powerful, so you might want to use it even if your device has poor or no timing information. If you have accuracy of a single second, you can correlate voice and location as accurately as mapping can be. I call this power audio mapping. It lets you map POIs and side roads without ever driving through them. Just say "Here is a road at 4 o'clock, called Musterstraße, one way from here, cycles allowed both directions, speed limit 30" as you pass by. In the time you say this, you might drive 200 meters. To know exactly where the road is, say the "here" at the very point where the roads meet.
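The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how JOSM or any particular tool does it: the function name position_at and the track layout (a time-sorted list of (time, lat, lon) tuples) are assumptions made for this example. It interpolates linearly between the two track points on either side of the recording time.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone


def position_at(track, when):
    """Linearly interpolate (lat, lon) at time `when`.

    `track` is assumed to be a list of (datetime, lat, lon) tuples sorted by
    strictly increasing time. Returns None if `when` is outside the track.
    """
    if not track or when < track[0][0] or when > track[-1][0]:
        return None
    times = [t for t, _, _ in track]
    i = bisect_left(times, when)
    if times[i] == when:
        # Exact hit on a track point, no interpolation needed.
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    # Fraction of the way from the earlier point to the later one.
    f = (when - t0).total_seconds() / (t1 - t0).total_seconds()
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)


if __name__ == "__main__":
    start = datetime(2011, 3, 1, 11, 24, 30, tzinfo=timezone.utc)
    demo_track = [
        (start, 52.2000, 0.1200),
        (start + timedelta(seconds=10), 52.2010, 0.1220),
    ]
    # Where was I when the recording stamped 11:24:34 was made?
    print(position_at(demo_track, start + timedelta(seconds=4)))
```

With one track point per second, the interpolation error is small compared to GPS accuracy. With sparse logging intervals, the result is only as good as the assumption that you moved in a straight line between two points.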

Timestamps with second accuracy
If your device does this, you're lucky and you can do power audio mapping from the start. If it does not, don't worry, but read the next chapters.
Even with timestamps in the form 11:24:34, your recording device's clock will have some lag relative to GPS time, making your timestamps inaccurate. A difference of more than 5 seconds will render your recordings useless and even harmful, because you will map things in the wrong places and at the wrong angles. Other mappers may trust your wrong data and mistrust their own, leading to chaos. So please be sure to do the spatial synchronisation described below. You will have to do this only once, as your time lag will be constant.
You may then use many small audio snippets or a single one, just as you like. If you're using a single one, you might like to consider JOSM's automated time synchronization as above.
Timestamps with minute accuracy or no timestamps at all
Minute accuracy is too coarse for serious mapping. You either use contextual synchronizing, or you make your own timestamps: just speak out the time. If this time is known to be exactly the GPS time, fine. If it may lag GPS time by more than 3 seconds, do a one-time spatial synchronization. If you do not have any clock with a seconds display, you will have to do a spatial sync for every single recording. In this case, you should consider one big audio file, as the sync is time-consuming. I bet you do not want to do this more than 3 times on a single mapping day!
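As a rough sketch of the arithmetic involved: if you speak the GPS time into the recorder once, the difference between that time and the recorder's own timestamp is the lag, and the same correction applies to every other recording. The helper names below are hypothetical, and the last function assumes constant speed, simply to show why a few seconds of lag matter.

```python
from datetime import datetime, timedelta


def clock_offset(gps_time, recorder_time):
    """Offset to add to recorder timestamps to get GPS time."""
    return gps_time - recorder_time


def to_gps_time(recorder_time, offset):
    """Apply a constant offset to one recorder timestamp."""
    return recorder_time + offset


def position_error_m(lag_seconds, speed_kmh):
    """Distance travelled during an uncorrected lag, assuming constant speed."""
    return abs(lag_seconds) * speed_kmh / 3.6


if __name__ == "__main__":
    # You read "11:24:34" off the GPS while the recorder stamps 11:24:29.
    offset = clock_offset(datetime(2011, 3, 1, 11, 24, 34),
                          datetime(2011, 3, 1, 11, 24, 29))
    print(offset)
    print(to_gps_time(datetime(2011, 3, 1, 12, 0, 0), offset))
    print(round(position_error_m(5, 25), 1), "m")
```

For example, an uncorrected lag of 5 seconds at 25 km/h shifts a feature by about 35 meters.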

Spatial synchronizing

This method gives you all the advantages of time-based power audio mapping. It gives you sub-second accuracy even if your device has no timestamp feature at all, and it gives you a way to ensure the accuracy of existing timestamps even if your clock lags.

First, you start GPS recording. After you have a fix, wait 2 to 5 minutes so your GPS signal will reach its full accuracy. Go to a suitable place (you will know when a place is suitable after you have read this text...) and look for a point on the ground that is visible and identifiable (maybe a line, a gully cover, etc.). Say "doing a sync very soon", accelerate and pass this point at a certain speed. If you usually do your mapping at 15 to 35 km/h, then pass the point at a minimum of 25 km/h. At the very moment that you pass it, say "NOW!". Then turn around, possibly saying "turning around to do the second pass", and pass the same point again at that high speed, saying "NOW!" again. Pass it from the opposite direction (180°) if you are on a linear road. If you are on some kind of open area, it may be better to pass it at 90° to the first pass. But never do both passes from the same direction; this won't give you any result.

If you chose the 90° version, you will see the crossing point on the GPS log. At 180° it is not clearly visible, but all the needed information is there.
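One way to evaluate the two passes afterwards is sketched below, under assumptions that are not part of the method itself: the track has already been converted to local coordinates in meters with times in seconds, and now1 and now2 are the times of the two "NOW!" marks on the recorder's clock. The function names are made up for this example. The idea is to try a range of candidate lags and keep the one for which both marks land on the same spot of the track.

```python
from math import hypot


def interpolate(track, t):
    """Position (x, y) at time t; track is a list of (t, x, y), t increasing."""
    if t < track[0][0] or t > track[-1][0]:
        return None
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            f = (t - t0) / (t1 - t0)
            return x0 + f * (x1 - x0), y0 + f * (y1 - y0)
    return None


def estimate_lag(track, now1, now2, max_lag=60.0, step=0.1):
    """Find the lag (GPS time minus recorder time) that best aligns both passes.

    Returns (lag, gap), where gap is the remaining distance in meters between
    the two positions at that lag, or (None, inf) if no candidate fits.
    """
    best_lag, best_gap = None, float("inf")
    steps = int(round(2 * max_lag / step))
    for k in range(steps + 1):
        lag = -max_lag + k * step
        p1 = interpolate(track, now1 + lag)
        p2 = interpolate(track, now2 + lag)
        if p1 is None or p2 is None:
            continue  # this candidate falls outside the recorded track
        gap = hypot(p1[0] - p2[0], p1[1] - p2[1])
        if gap < best_gap:
            best_lag, best_gap = lag, gap
    return best_lag, best_gap
```

A wrong lag moves the two positions apart in opposite directions when the passes were made at 180°, so the gap has a clear minimum. If both passes are made in the same direction, the gap barely changes with the lag, which is why that case gives no result.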

How to import into JOSM

As explained in the chapter How to synchronize, an audio recording must be synchronized, and to this end it is associated with a GPS track. The import feature for audio files is therefore not where you might look for other files to import. It is not in the File menu (usually the first place to try) but in the context menu of GPS tracks. Your loaded GPS tracks can be found in the layer dialog. Importing an audio file this way associates it with that GPS track.

Hint: At the time of writing (March 2011), JOSM supports only audio files in WAV format.

Audio recording with GPS-aware devices

If you use your PC, mobile phone or PDA, you don't need the sync methods above, as these devices will take care of it. There is an article about PDA Voice Recording. A list of software follows:

Gpxrec for Android

This app records an audio clip and a GPX file when launched, then closes itself.

It is possible to disable the self-close function and use the rec button to take further voice notes.

The app records only if the GPS has a fix.

In the gpxrec folder you will find pairs of files: a GPX file and a 3GP file.

The GPX file contains the coordinates and the link to the 3GP file.

The 3GP file contains the audio.

This app has some trouble on first start, but once all the permissions are enabled it runs correctly.

Please write to user iw1gfv if you can debug and improve this app.

The app can be downloaded here:


The application gpstrigger can record voice from a microphone, saving it as WAV and its GPS position (from gpsd) in a GPX file. The latest JOSM is able to display these voice tags on the map. For easier handling you could connect a joypad to your notebook and start and stop recording with two buttons. This application is under heavy development. It currently uses the Gtk+ library, but there are plans to turn it into a library so that it integrates better into other environments.


Download a current SVN snapshot here.

GPSTrigger works now with josm-latest!

It seems there's a problem using the Java Sound API on some Linux systems. If you get an exception while playing audio files then try to kill your sound daemon (esd, artsd,...).

Hint: gpstrigger works with Linux only!

Example use in a car

bash/dialog dictaphone

This hack is a command line recorder that uses dialog and sox. You start/pause recording with the spacebar and finish with N. Then all the little files recorded will be concatenated into an .ogg file, which you can rename as you wish. I'll take a look at GPSTrigger though, but some of you might find this toy useful or at least funny. [1]


Manauton is a Linux CLI/nCurses application for recording sounds. The great thing for OSM is that it can work in autonomous mode, detecting speech and logging to separate files. It supports negative latency (recording starts before the manual/autonomous trigger) and it can put a click-track into the resultant wave file to denote time (which is independent of the file timestamp).

The output wave file can be one large file, containing the sequential recordings (with click-track on each section) or autoincremented separate files. There is a small utility for detecting/decoding the click-tracks.

No development for a while, but worked last time I played with it... it appears that the alsa library has changed since manauton was written. To get it compiling use the following at the top of Manauton/alsa_pcm_reader.c

//#include <sys/asoundlib.h>
#include <alsa/asoundlib.h>

Note: I've confirmed this application works on multiple desktops, however it does not work on my laptop. I believe there is some issue with CPU speed switching, or other power saving system.


Empass is a compass application that displays the driving direction from gpsd. It shows a nice animated compass and is fully themeable with Edje. This library is part of the Enlightenment Foundation Libraries (EFL), so you need to install it from the Enlightenment CVS. See this website for how to do this.


Download a current SVN snapshot here.

Hint: Empass works with Linux only!



OSMtracker is an application for Windows Mobile PDAs/Pocket PCs, with an Android port as well.

The main purpose for the first release is to do quick (voice) waypoint annotations when driving a car or on a bicycle.

More information is available on the OSMtracker (Windows Mobile) and OSMTracker (Android) pages.

OSMtracker for Android can record audio notes when the wired headset button is clicked (for now, only while the app is active in the foreground).

Gps Event Sync

Gps Event Sync is a command-line tool that converts a list of events (transcribed from an audio recording) and a GPX trace into a list of GPX waypoints. This may then be opened in JOSM or another editor for further work.
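To illustrate the general idea (this is not Gps Event Sync's actual code or input format, which are not described here), the following Python sketch takes a GPX 1.1 trace and a list of transcribed events and places each event at the track point nearest in time. The function names, the (time, text) event layout and the creator string are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

GPX_NS = "http://www.topografix.com/GPX/1/1"


def read_trackpoints(gpx_text):
    """Return [(time, lat, lon), ...] for every timed <trkpt> in a GPX 1.1 text."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.iter(f"{{{GPX_NS}}}trkpt"):
        stamp = pt.findtext(f"{{{GPX_NS}}}time")
        if stamp is None:
            continue  # points without a time cannot be matched to events
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        points.append((when, float(pt.get("lat")), float(pt.get("lon"))))
    return points


def events_to_waypoints(gpx_text, events):
    """Build a GPX document with one <wpt> per (timezone-aware time, text) event."""
    points = read_trackpoints(gpx_text)
    if not points:
        raise ValueError("trace contains no timed track points")
    gpx = ET.Element(
        "gpx", {"version": "1.1", "creator": "event-sketch", "xmlns": GPX_NS}
    )
    for when, text in events:
        # Pick the track point whose timestamp is closest to the event time.
        nearest = min(points, key=lambda p: abs((p[0] - when).total_seconds()))
        wpt = ET.SubElement(gpx, "wpt", {"lat": str(nearest[1]), "lon": str(nearest[2])})
        ET.SubElement(wpt, "name").text = text
    return ET.tostring(gpx, encoding="unicode")
```

Using the nearest track point keeps the sketch short. Interpolating between the two neighbouring points would place events more precisely when the logging interval is long.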


Initially made for MTK-based loggers, BT747 can position any file on a map (well advanced dev version). It runs on Windows, Linux, Mac OS X, Mobile Phone (J2SE), Palm, WindowsCE. "Tagging" is currently available only on the first three OSes but is intended to be extended to the other ones.


GPSMid is a Java ME (J2ME) application that will run on most mobile phones. It is primarily an OSM router / map display application, but can be used successfully for audio mapping if it is paired with a Bluetooth GPS receiver such as a Nokia LD-3W. It allows an audio clip to be recorded when required, and creates a GPX waypoint which is associated with the clip. Some processing of the files is necessary to use them with JOSM, as described in GPSMid.

A separate PyS60 application allows you to save a waypoint in a GPX file and assign a voice recording to it for use in JOSM.


avp2wpt is a java application that parses a gpx trace file, compares times with audio/video/photo (or whatever) files, and creates waypoints accordingly.



OsmAnd is a GPS navigation and map application that runs on many Android and iOS smartphones and tablets.

It also allows for basic POI adding as well as recording audio (and video) notes, although only via screen interface (headset button to activate audio note recording is currently not supported).

Field reports

By David.earl

Using a voice recorder is much quicker than paper when mapping by bike. The track points you collect are ordered and JOSM (for example) can show them connected if you set the option. This means you can see the route you took and follow the voice recording in order. So you already have the topology, you just need the names of things.

An Olympus VN-480PC voice recorder dictaphone

I found this little digital gadget, about the size of a mobile phone (or GPS receiver). It cost me £40 from Argos, though there are many others. It can upload WAV files to a Windows PC, though in fact I've found it is easier to play back on the recorder itself (connected to the PC speakers, so that I can just press the pause button rather than repeatedly switching windows between a PC audio player and the OSM editor), so now I'm just keeping the WAV files as evidence of source.

I tried continuous recording in the hope that I could synchronise with the track point times, but that was too hard and I got fed up of "silence" consisting of heavy breathing and traffic noise while cycling.

The background noise out on the street is too intense for the voice activation to work, so now I am pausing recording between voice annotations. That gives an almost continuous playback which is only a few minutes long and easy to apply street names and features from it in sequence after creating the nodes, segments and ways.

I tried using a headset to support the microphone, and the clip mike that comes with the recorder, but I concluded in the end it was easier to slip the device out of my jacket pocket or bar bag and speak directly into its built-in microphone. Mounting it on the handlebars was OK, but I found it too easy to press "stop" instead of "pause" or "continue" when mounted like that.

(Actually I've gone back to using a headset now. Playback on this device continues across file boundaries, and the record button does both 'resume after pause' and 'start new recording' so it doesn't matter much if I press stop rather than pause. Though perhaps marginally less convenient, in the end I was worried about the safety aspect of hand holding the dictaphone. David.earl 10:49, 27 September 2007 (BST))

If I note "dead end" so I know where I've turned round and retraced the route, then it is mostly sufficient simply to say "left/right into <street name>" and note context where there might be ambiguity on the track log and features like pubs, post boxes and the like (either by context if by a junction, e.g. "The Red Lion pub on north-east corner of junction with High Street" or by putting a loop in the track log by cycling in a big circle or going round the pub car park, which is quicker and less fiddly than trying to mark a waypoint on the GPS receiver). Likewise where a street ends and turns into a track, or where the name changes without any change in direction to mark the context.

The briefer the audio notes the better: short playback means quick editing time back at the computer.

By Chrismorl

I find a bit of redundancy in the audio commentary to be helpful in case of obliteration by passing cars, etc: "turn left into Cross Street, turn left from Cross Street into Pye Avenue, a cul-de-sac, P,Y,E... return down Pye Avenue... turn left into Cross Street..." My device is a Matsui TRQ-10D which has timestamping, but no computer connection (no permanent evidence, but needing it is very unlikely). It was only £15 at Currys. However, it did lose data on one occasion after needing a reset. I find I don't use the timestamping: the commentary is sufficient. As mentioned above, manual starting and stopping of recording is necessary. This device has the same beep to mark both, which is a bad idea because it is possible to get out of sync - recording the heavy breathing and no information - if you haven't checked visually what it's doing.

The ideal device would be one which you wore on your lapel (or had a separate microphone) and had a remote on/off switch with two feelable positions which you mounted on the handlebars. This is a bit specialised for a commercial device especially as I see today that the OSM market will only exist in the UK for another 18 months... Chrismorl 17:17, 24 November 2006 (UTC)

By Frank

An Olympus VN-120PC voice recorder

I'm using an Olympus VN-120PC. It's a nice small device for about 50€ (in 2007). There is no Linux software, so I had to reinstall the w2k that came with my test PC on a removable disk to use it.

Good things

  • records timestamps to the second and saves them in the wav file
  • good microphone
  • syncs the recordings to the PC (but win only)
  • syncs the time from the PC (but win only)

Bad things

  • needs special software (win only) to sync to a PC
  • does only 3 x 100 recordings, even if there is space for more

I've written 2 Perl scripts to handle the wav files. Contact me if you are interested. Here is the timestamp script.

My usual procedure is:

  • record the time from the GPS at the start
  • go on the mapping trip
  • transfer the data to a Windows PC, then copy them to my usual Linux system
  • run the timestamp script to create a list of the wav files and their recording times
  • edit the list
    • add the time delta (recorded GPS time)
    • listen to the wav files and take notes
  • run the other script with the recorded GPX file and the list file to create a waypoint GPX file

Frank 23:17, 6 February 2008 (UTC)

By Villus

Some of the Olympus voice recorders work with odvr.

By John07

I own a typical MP3 player (TrekStor i.Beat joy 2.0) with an internal microphone. The recording works quite well, but I wanted to have an external microphone for better handling while biking.
After I asked talk-de about solutions for that issue (the Line-In doesn't work with typical headsets), I removed the internal microphone and attached a cable with an external microphone (a small one from an old dictaphone) to the pins of the internal microphone.
If you want to do the same (at your own risk!), write me an email for more information and pictures. John07

By Babstar

Thomson-Lyra 2GB MP3 player, model 6675. Available in Australia from Dick Smith Electronics for around $US35. Similar models should be available globally.
I looked around at all sorts of digital recorders, and none really fitted what I wanted for the right price. The Thomson caught my eye as it recorded directly to .wav format (for JOSM) and connects to the PC via USB mass storage, good for Linux & Mac.
One drawback of this model is the lack of an external mic jack. The on-board mic is good, sound quality is excellent when holding it at a comfortable distance from the mouth. The volume tends to drop off when sounds are more than around 75cm (2'), although still audible. I calibrated the unit and found it was spot on after approx 1 hour of recordings. JOSM unfortunately couldn't read the .wav files produced by the unit initially. Some investigating showed the unit uses compression in the .wav format. This however can be easily fixed with SoX via conversion to the raw format.

# sox -V foo.wav -r 44100  foo.raw
# sox -V -r 44100 -b 16 -e signed-integer foo.raw foo-new.wav

The converted file can now be read by JOSM. One advantage of this compression is the reduced file size on the unit, approximately 15-20 MB per hour, which comfortably gives the recorder about 80 hours of recording time. Once decompressed, the resulting file is about 10 times the original size.
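Before converting, it can help to check whether a given .wav file is compressed at all. The Python sketch below reads the format tag from the file's 'fmt ' chunk; a value of 1 means plain uncompressed PCM. The function names are made up for this example, and the assumption that plain PCM is what the editor expects is based only on the experience described above, not on JOSM documentation.

```python
import struct

WAVE_FORMAT_PCM = 1  # format tag for uncompressed PCM in RIFF/WAVE files


def wav_format_tag(path):
    """Return the format tag stored in the 'fmt ' chunk of a RIFF/WAVE file."""
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12:
            raise ValueError("file too short to be a WAVE file")
        riff, _, wave_id = struct.unpack("<4sI4s", head)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                tag = f.read(2)
                if len(tag) < 2:
                    raise ValueError("truncated 'fmt ' chunk")
                return struct.unpack("<H", tag)[0]
            # Skip this chunk; chunks are padded to an even number of bytes.
            f.seek(size + (size & 1), 1)


def is_plain_pcm(path):
    """True if the file declares uncompressed PCM audio."""
    return wav_format_tag(path) == WAVE_FORMAT_PCM
```

If is_plain_pcm returns False for a recording, a conversion such as the SoX commands above is probably needed.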

By Gilles Bonnard aka leblatt

I liked the idea of audio mapping, so I tried Mobile Trail Explorer. I had compatibility and stability issues, so I reverted to good old trekbuddy. I really missed the voice notes, and I wanted to use pictures too, so I thought of a little utility that I called avp2wpt: audio/video/photo to waypoint. I record my trip as a gpx file, while taking voice notes, pictures or videos. Then I put them all in a folder, and the program scans the files and creates waypoints for me. It's that simple; I wonder why I couldn't find such a script before.

By Chris Krahe

I do my recording while driving at 30 to 70mph, often with the convertible top down (i.e. noisy). So, I needed a device that could capture my voice despite the wind noise, was lightweight and could clip to my seat belt, allowed easy one-button-record, had a dimmable screen, and supported second precision. I also would be making lots of small recordings, maybe 5-30 seconds each, so it had to be suited to that. The Olympus VP-10 digital recorder (just under $80 US on Amazon) fits all of these requirements wonderfully. I found the "pocket" recording scene (preset collection of audio recording settings) captured my voice clearly, clipped mid-chest with head and eyes up looking at the road. The record button is simple and unambiguous, creating a new time-stamped recording each time I turn it on. The device is also quite configurable, which is nice. Two nits: The device doesn't appear to show the seconds of each timestamp, just the hours and minutes (but I can obtain them when I transfer the MP3s to my computer); and the clip isn't as strong as I'd like, but strong enough for its intended shirt-pocket purpose (for me it fell off under heavy braking and high-load turns, albeit only a handful of times in about 400 miles of driving).