Arabic/English Tagging Discussion
Which language to use for the name tag?
- Metehyi would like to discuss the tagging scheme for Egypt: name=القاهرة, name:fr=Le Caire, name:en=Cairo, etc. The current state of mixed Arabic/English names is not satisfactory, because the labels overlap.
Here is the continuation of an email discussion started by Metehyi :
- From tellmy@gmx_net
- Subject Re: arabic place names
- Date Thu Feb 05 19:26:59 +0000 2009
- Hi Metehyi,
- thank you for sending this mail. I actually would have liked to contact you, but I am not too familiar with sending mails via openstreetmap (or even identifying who made entries/changes and how to contact).
- And yes, I totally agree that it is better to have a fruitful discussion and agree on a common approach. After all I assume we have the same goal: to have a useful map of Egypt for all...
- The reason why I am taking out the arabic names from the "name" key is that even if some internet browsers are capable of displaying them, other devices are not. In particular, navigation devices start displaying hieroglyphics instead, so the map becomes unreadable because of that. And who knows which other devices have the same problem.
- The main driver for me personally is to have an electronic map that can be used for navigation systems. I have been searching for years for such a map - there simply isn't anything useful. This is the reason why I started all that work some years ago. And you might have noticed that I have contributed quite a bit to the current map.
- So you may now say: how egoistic, just because Tarek wants to use the map on navigation systems...
- But there is another reason. Which is that from a structural point of view it is not "clean" to mix languages under one key. It is a bit similar to putting street name, number, ZIP code, city and country all under one key to save space. The data structure clearly foresees different keys for different languages. And in future I am sure that browsers will be able to display the "name:ar" key. So putting each language under its key is cleaner, more future proof and allows every client to display whatever it is capable of in a correct way.
- For this reason you might have noticed that I have never DELETED the arabic name. I have always put it under the key "name:ar" because there is where it belongs. And there it adds value. I would have added the arabic names myself, but living abroad I am lacking an arabic keyboard, so it is very difficult for me to add this information...
- So to put it short, all that I am saying is: let's put the arabic names, but let's put them where they belong.
- Does this make sense to you?
- I hope I could clarify my motivation and I am happy to hear your view on this.
- By the way, you won't believe how happy I am to finally have found a fellow mapper, driven by the same spirit. I would really like to exploit possibilities of exchanging experience and maybe optimising our work (e.g. splitting up regions or similar).
- With mappers' greetings
- Telly, I suggest using a unified tagging scheme where the name appears in Arabic. There are many problems with dual-language tagging such as en/ar:
- 1- The maps are overcrowded with unreadable text; on a small GPS device the labels will show up as black spots.
- 2- This is community work, and it is hard to satisfy every new tagger (one could ask: why English/Arabic and not French/Arabic?). Most people and future users of such maps are locals who would like to have places named in their original language. Mapping a single city is a huge task (let alone a whole country), so we need as many contributors as possible, and language is a big argument.
- 3- It is very difficult to edit or correct the spelling of names afterwards, because of the mixture of left-to-right and right-to-left text.
- 4- The spelling in Arabic is unambiguous. For example, there is one and only one way to write شارع جمال عبد الناصر, whereas in English one can write Jamal Abd el Nasser, Gamal abd en Nasser, Jamale Abd Al Nasser or Jamal Abd al Nacer. Another example is شرم الشيخ, written Sharm el sheickh, Charm al Cheich, Sharm el Sheikh, etc. If one later needs to search for a precise address, street or place name, it will be hard to know how to enter the correct name, whereas the Arabic spelling is unique. Try it, for example, in the search box on the left of the OSM homepage, near the tag "where I am".
- 5- It is easy to generate language-specific Garmin or other maps with a text editor and choose the language later: find and replace all name= (as in name=القاهرة) with name:foobar=, then search for name:en= (as in name:en=Cairo) and replace it with name=. This works without altering the whole OSM database (e.g. by working on a local XML-formatted dump).
- 6- The 1000-year-old oriental cities have many small streets; the combined tag will not render because of its length, or the rendering software will cut off the string and show half the Arabic and half the English name.
- 7- more arguments to come ;-)
- --Metehyi 22:29, 5 February 2009 (UTC)
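A minimal sketch of the swap described in point 5, done on a local copy of an OSM XML extract so the OSM database itself is never touched. It parses the XML instead of doing a plain text find/replace, which is a swapped-in technique, and it only parks the old default name when a replacement actually exists. The function name `promote_language`, the sample node and its id and coordinates are made up for illustration; `name:foobar` is the placeholder key from the comment above.

```python
import xml.etree.ElementTree as ET


def promote_language(osm_xml: str, lang: str = "en",
                     backup_key: str = "name:foobar") -> str:
    """Return a copy of the dump where name:<lang> has become the default name.

    Hypothetical helper: the old value of "name" is parked under backup_key,
    and elements without a name:<lang> tag are left unchanged.
    """
    root = ET.fromstring(osm_xml)
    for element in root:
        tags = {tag.get("k"): tag for tag in element.findall("tag")}
        localized = tags.get(f"name:{lang}")
        if localized is None:
            continue  # no name in the requested language, keep the element as is
        current = tags.get("name")
        if current is not None:
            current.set("k", backup_key)  # park the previous default name
        localized.set("k", "name")  # the localized name becomes the default
    return ET.tostring(root, encoding="unicode")


# Illustrative extract; id and coordinates are invented.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="30.05" lon="31.25">
    <tag k="name" v="القاهرة"/>
    <tag k="name:en" v="Cairo"/>
  </node>
</osm>"""

if __name__ == "__main__":
    print(promote_language(SAMPLE))
```

The converted copy can then be fed to whatever map compiler is used for the device; which tool that is lies outside this sketch.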
- I have the following comments on those seven points:
- On 1- Fully agreed, and that was exactly my motivation to remove the arabic duplicates of names and put them under "name:ar" where I believe they belong
- On 2- I agree with you that this is a community work. And I guess that I simply took English for the same reason that you took it to write your text. Or why openstreetmap took it for the keys etc. French would have worked just as well, but English is simply THE most common denominator and by using it we would increase the community we could address/serve.
- I do not share your opinion that most users will be local. On the contrary, I strongly believe that most of the users will not be, because such material is typically made for travellers. People looking at Egypt's map will be coming from all sorts of countries.
- I personally travel a lot and whenever I go to a place I look up the available maps on openstreetmap. In some cases like China people seem to have chosen a similar approach as you are suggesting. Please take a look at the Chinese material. It's all in Mandarin (I guess) and makes it completely unusable to the rest of the world. To me displaying the street names only in Arabic would contradict the spirit of being open suggested by the name "openstreetmap".
- Also, to me it sounds highly likely that whoever will be contributing to this work will have some minimum knowledge of English. It is hard for me to believe that someone who is not familiar with the latin alphabet (which is basically all that you need) can work his way through potlatch, JOSM or similar.
- On 3- True. This problem would not exist if we stick to putting the Arabic names under "name:ar". Again, I strongly believe that from a structural point of view this would be the more appropriate approach.
- On 4- You have a very valid point here. And for exactly this reason I have consistently used identical spellings for the most common names:
- Al-... (instead of El- which could have been used just as well)
- Sheikh ...
- Mohamed, Mahmoud, Ahmed, ... (whereas I know there are different ways of writing these names as well)
- But I do agree with you of course that this approach will not solve the issue to 100% and only writing it in Arabic would. But this is something all maps that are not written in Arabic are living with and it obviously seems to work fine.
- And in any case, searching for the arabic name works just as well if the arabic name is stored under "name:ar" instead of "name". Just try it out...
- On 5- This is a very good idea. Frankly speaking I have not thought of it but it should work. I will definitely try this one out. Thank you for this.
- On 6- Completely true and this is why I come to the conclusion that we should only keep the English version under the "name" key
- On 7- Happy to hear them :o)
- --Best regards, Tarek
- I don't know what the best way of tagging is now. Japanese tagging, for instance, puts both in the name: 首都高速道路 (Metropolitan Expressway)
- What is the situation in other countries that use the Arabic alphabet?
- --Esperanza 10:04, 8 February 2009 (UTC)
- There are a few arab states where we can talk about mapping projects, usually some travellers from Europe/America are uploading their holiday GPX Files and consequently tagging in English only.
- The Japanese map is partly bilingual (big cities), but I noticed that only major features like motorways and tourist places are so. The other issue is that the Japanese are 80, maybe 100 million people, while the Arab nations are more than 800-900 million, and a lot of them don't even speak English but French as a second language.
- One of the problems in Potlatch/Merkaartor/JOSM is that writing on the same row English-Arabic is very tricky, making future edits impossible.
- --Metehyi 18:43, 8 February 2009 (UTC)
- More resources on tagging problems
- Please read the street names recommendations here http://wiki.openstreetmap.org/wiki/Editing_Standards_and_Conventions
- --Metehyi 19:37, 8 February 2009 (UTC)
- I second Tarek's opinion. We are trying to create a globally useful, universal geo dataset of the whole world. Any database needs a pivotal point of reference, or a "primary key" in order to facilitate a standard and consistent means to look up the data. The "name" tag is the perfect (and only) key to fit that purpose.
- Regarding accessibility. I also look up maps of places I travel to on OSM. Imagine if I surfed to Moscow's map and saw the names in Cyrillic? It would be very frustrating for someone to see characters only the locals can read instead of writings he/she can remember and use to, say, locate the street where their hotel is. This would be the case with millions of potential OSM users visiting Egypt each year (whose contributions we also can't overlook!)
- On the other hand, all would-be users of OSM who are native Arabic speakers are comfortable reading the English transliteration. Therefore this way no one is left out.
- - Regarding consistent transliteration. I'm glad you guys raised that point, because we definitely need to establish a standard here. But this is needed regardless of which key will hold the English version. I have some thoughts on this issue, but I think it's best discussed in a separate, dedicated thread. Once a standard is established and applied, search will not be a problem.
- - Creating a custom, localized version using the language of your choice by selecting which tag is rendered/displayed, as you point out, will always be possible, and I think the argument actually works the opposite direction: if you need to create an Arabic map of say, Africa. Which tag do you use for names? If 'name' is always in English and 'name:ar' always holds the Arabic version, it's a straightforward process. If, on the other hand, we use name=Arabic scheme, it's tough business as you'd need to use different tags inside different country boundaries. Much more complicated. Not what you want in an open dataset.
- Single bilingual tags are not a good solution because, as you mentioned, of space constraints, portability, and other issues. We all seem to agree on that.
- --Estr4ng3d 20:41, 3 December 2009 (UTC)
- I disagree on some points. When I look at the map of Germany, the name tag carries the German name of places, so I would also use the Arabic name for the name tag in Arabic-speaking countries. I would make an exception for important tourist destinations though, using bilingual name tags for (only) those. In addition, I would put the Arabic name in a name:ar tag too (so the Arabic name is in name and name:ar) and, if the place has a well-known English name, put that in a name:en tag, the German name in a name:de tag, etc. If a place is not well known in the English language, then don't set a name:en tag, but put a transliteration in the name_int tag instead. A rendering engine can produce an Arabic map by using the name:ar tags, an English map by using name:en tags (and name_int if not available) or maybe a map in Farsi by using name:fa, and name:ar if name:fa is not available. If someone wants to render a map for local use, just the name tag would be used.
- So an example for a tourist destination would be name=Cairo القاهرة, name:ar=القاهرة, name:en=Cairo, name:de=Kairo and for a "normal" place name=بنها, name:ar=بنها, name_int=Banha
- --User:Lyx 21:53, 3 December 2009 (UTC)
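The fallback order described above can be sketched as a simple lookup over an element's tags. This is only an illustration, not how any particular renderer works: `pick_name` and the three preference lists are hypothetical names, and falling back to the plain name tag as a last resort is an added assumption that the comment above does not state.

```python
from typing import Mapping, Optional, Sequence


def pick_name(tags: Mapping[str, str], preferences: Sequence[str]) -> Optional[str]:
    """Return the value of the first preferred key that is set and non-empty."""
    for key in preferences:
        value = tags.get(key)
        if value:
            return value
    return None


# Preference chains taken from the proposal; the trailing "name" is an assumption.
ENGLISH_MAP = ["name:en", "name_int", "name"]
ARABIC_MAP = ["name:ar", "name"]
FARSI_MAP = ["name:fa", "name:ar", "name"]

# The two example places from the proposal.
cairo = {"name": "Cairo القاهرة", "name:ar": "القاهرة",
         "name:en": "Cairo", "name:de": "Kairo"}
banha = {"name": "بنها", "name:ar": "بنها", "name_int": "Banha"}

if __name__ == "__main__":
    for place in (cairo, banha):
        print(pick_name(place, ENGLISH_MAP), "|", pick_name(place, ARABIC_MAP))
```

With this kind of lookup the language shown is decided by the map maker's preference list, whatever ends up in the default name tag.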
- Actually Europe is a completely different case, since you'd be using the same Latin character set and so the names would be just as informative/useful for a reader who does not understand the language. For example, name=Hannoversche Straße is clear enough for anyone using the map who doesn't know German, and, in fact, adding name:en=Hannoversche Street will be a bit redundant as it would not add much information. This is hardly the case with Arabic; the script is of no use to the non-speaker.
- Once again, of course, you can choose the tag(s) you want when making a custom map, producing the language you want; no one is contesting that fact. But we want the rules to be universal in order to have a consistent way of building that map. Think about a continental map, or a cross-country name tag query. It's not very convenient to get different languages in your answer depending on where each element resides.
- I disagree with mixing languages in one tag. Consistency is a major goal in my opinion; it wouldn't be a "clean" solution if I query the name tag once and get a complex string containing two languages, and then query the same tag once more on a different element and get one language. Also, whether a place is "touristic" or not is highly subjective. In your example you considered Cairo touristic but not Benha. I can guarantee you someone from Benha would disagree and revert your edit... and the same will happen with other places, etc. i.e. edit wars.
- --Estr4ng3d 11:15, 4 December 2009 (UTC)
- I totally agree with abdelrahman. I get frustrated when I open the map to check a certain area and find out that I can't even read it. So I suggest always keeping the default name tag in English and moving the Arabic to name:ar, which has been my aim in editing the map: to provide suitable English and Arabic tagging.
- As for the multi-pronunciation for the arabic words in english, I suggest to add a page on the wiki that contains a unified english equivalent for every arabic word that could be used someday on the map.
- --eldolly (AKA Ahmed) 5 December 2009
- I don't understand how you can't read a certain area. What is your OS, your browser? Are you browsing the map from lynx?!! Please give us a clear example of an area or POI or street or whatever you couldn't read in Arabic!!! (I don't think you will find one...) --Jarkas 17:05, 5 December 2009 (UTC)
- There is no unified Latin equivalent of Arabic script and pronunciation, and there probably never will be; this is not a new problem. Or can you provide us with an innovative solution? --Metehyi 18:34, 5 December 2009 (UTC)
- I did not mean that I couldn't read the Arabic words; I meant the many other languages which do not use the Latin alphabet. As for the solution, I have already suggested adding a new sub-page to the Egypt project page (or even to the Arabic-speaking countries page) with a table of convenient English equivalents for the Arabic names. In this table we should, as mappers, agree upon a unified set of names to be used. For example: we can all agree that the equivalent for محمد is Mohamed, not Mohammed or Mohamad or whatever, and so add Mohamed to the table as the agreed name for محمد, and so on.
- eldolly, this is not as easy as you think. Think about the letters ح ، خ، ه and the many marks like shadda, damma, fatha and hamza, and consider that sometimes you find regional pronunciations or dialects. How would you translate تلعة الثعبان ? Talaat as so?ban, but if you also have طلعة الثعبان أو طلعات الثعبان etc., how would you transcribe them in Latin? Your idea above is a job for linguists, not for mappers. --Metehyi 21:29, 9 December 2009 (UTC)
- I'll give a real example of why tagging in English is so BAD for Arabic users. Once I was in ميدان لبنان (Midan Libnan) and wanted to go to شارع محي الدين أبو العز (Mohey Aldin Abo Al Ez). Since I'm not from Cairo and don't know the streets well, I thought it a good idea to look at the map on my mobile (I'm a mapper and always have the latest data from OSM). I tried to find the street and knew it was near (less than 1 km), but no way! I tried the map and couldn't find it; I tried the search and didn't find it either. I spent more than half an hour working out where the street is located on the map, all because it's written in Latin letters!! And I consider myself good at reading (so what about people who don't know English at all?!!). I think if it had been in Arabic I would have found it, as Egyptians say, "in a minute", but unfortunately that wasn't the situation at the time. Please take a look at the map and try to locate that (very difficult to find) street: http://osm.org/go/xmxPUIA2-- --Jarkas 17:25, 5 December 2009 (UTC)
- Globally the name=* tag should contain the name in the local language, and the name in English (or the transliteration) should be in name:en=*. The fact that the default map rendering then seems to show names that "western" people can't read (Arabic script) is of no concern to those entering the data; it's already possible to make a map from the data that always displays what's in the name:en=* tag, wherever it is available. Alv 11:39, 6 December 2009 (UTC)
- Whatever you put into the "name" tag (I prefer the local name, others might disagree); please make sure the arabic name is always in name:ar and the english name in name:en, even if one of them is also in "name". Only then is it possible to make a map where the mapmaker decides which language to use. lyx 23:02, 6 December 2009 (UTC)
- So here is my point of view:
- I am 100% supporting the position of Estr4ng3d and eldolly for the following reasons:
- • Having the default name tag in Arabic will limit the usage of the map to people who can read Arabic. For the rest (the majority) of the world it would be simply unusable. The spirit of having the map OPEN to everybody (reflected in the project’s name) would be violated
- • Having the default name tag in Arabic will limit the number of contributors to those who can write in Arabic. And even more, a lot of people who can write in Arabic may have the simple problem that they don’t have an Arabic keyboard and therefore would be excluded. I for instance would be an example for this
- • The example of Germany using local language does not fit at all as it is still in Latin characters and hence readable for all
- • The argument that transliterations are not unique does not hold
- o If the name is written in Arabic under name:ar you could still search for it in Arabic
- o We are talking about names. Names of places, streets etc. A substantial portion of them are originally foreign (e.g. Champollion street, Triumph square, ...), so you could argue that their Arabic transliteration is ambiguous as well and you should therefore use the original language. Otherwise you would find difficulties when searching for these names without really knowing how to write them in Arabic
- • Lyx’s preference on the language of the default name differs from mine (I prefer English). But I think he made a very good point, which is that the Arabic name should always be used in name:ar and the English name in name:en, no matter which of them is used in "name". I fully support this view and will follow it from now on
- • Even if there are some voices opposing, it seems to me that the vast majority of this group agrees on not mixing languages in one tag
- --Tarek 07:00, 11 December 2009 (UTC)
Dear Tarek, I will write my reply in Arabic because it will not matter much to foreigners.
The project will not be harmed if we write in Arabic, because the map will not become useless as you claim: the naming rules say that names must be in the local language and not in English, and I insist on following the rules, otherwise we are breaking the project's rules. The map should be open to the largest possible number of people, and you seem not to have noticed that the number of Arabs who use or will use the map is more than 200 million, while foreigners are a small minority of no more than a few hundred. Whoever wants to use the map in his own language can use the English tags, which are available in the node's data.
Using Arabic names will limit the contributions of Arabs and will not limit the contributions of foreigners; note that we are more than 200 million. The Arabic keyboard problem does not exist in reality. You are making up myths. Not having an Arabic keyboard is a very rare case. I have come across it a few times in my life, and only among a small group of expatriates who could not find such a keyboard where they live. Your words make me feel that Arab markets are still backward and do not meet customers' needs!!! When was the last time you visited an Arab country?
Thank you for bringing up the example of "Champollion street". This very street exists in Alexandria, and on the street corner there is a sign that literally reads "شارع شامبليون" in Arabic letters; I did not find "Champollion" written on it. The same goes for "هيبوقراط" (Hippocrates) street, "دينوقراط" (Dinocrates) street and many other names, and the rule says that names must be written exactly as they appear on the ground. It seems that you are either ignorant of the OpenStreetMap rules or trying to break them. In both cases you are committing a violation. Please watch your words in future, because they will expose you to being held accountable.
You prefer the English language (or fight the Arabic language); that is up to you, but do not impose it on others. The map belongs to the community, not to you. If you own a Garmin device and cannot display Arabic names on it, that is your problem, not mine or anyone else's. Learn how to build Garmin maps in the language you prefer and do not bother us with your proposal because of your laziness or ignorance. I have searched all the shops in Egypt and did not find a single Garmin device; everyone uses a mobile phone or a computer to view maps. So do not think that we face the same problem as you!!!
In summary: you oppose this view and impose all these assumptions for personal reasons that have nothing to do with the community (you have no Arabic keyboard, you cannot display the map on your Garmin device). Please solve your problems away from the project. --Jarkas 12:38, 11 December 2009 (UTC)
- I am wondering why Jarkas is not using the English language so that everybody can understand what he is saying and how he is thinking.
- I can sense quite some aggression in his contribution which is very surprising for me because - though controversial - the discussion so far has been very objective and respectful from all participating parties.
- I can neither think of any reason that justifies turning the discussion in the direction Jarkas is trying to, nor can I tolerate this for myself. If this forum chooses to go that way, then this is the end of the discussion for me. I do not intend to go down that road.
- --Tarek 20:21, 14 December 2009 (UTC)
Hi Metehyi, I did not really understand why you have suggested this, but would you see any reason to change option A in the voting adding "No reverting back to english after arabization" as you suggested, and not have a respective change in Option B? If not, then I would suggest to add "No reverting back to Arabic after setting the default name in English" Thanks, --Tarek 18:39, 18 December 2009 (UTC)
- as I am only occasionally contributing to the map of Egypt, I'm not going to vote on the script to use for the default name tag. However, regardless of the outcome of this vote, please make sure that a map renderer has a chance to use a language and script of his choice for map rendering. That means, if you decide to use the arabic name as default name, that the arabic name should go into "name" AND "name:ar" tags; if the default name is to be english, then the english name should go into "name" AND "name:en". If you can agree on that, I suggest to extend the vote text in this way.
- --Lyx 19:13, 18 December 2009 (UTC)
- Lyx, you are very welcome to contribute to the map in Egypt , as is everybody else, nobody can prevent you and you can contribute in English, we just want to have a stable, nice looking map. The issue with the vote is more to prevent reverting tags everyday/week/month. --Metehyi 21:52, 20 December 2009 (UTC)
To the voting community, it has been over a month now since the voting on the default language in the "name" tag in Egypt started. It seems that all main contributors to the map of Egypt who are eligible to vote have expressed their opinion. For more than 2 weeks I have seen no more changes to the voting. We have 2 votes for using Arabic as the default language, and 4 for English. So to me this looks like a clear and stable vote in favor of English as the default language. So if we can agree on this result, may I suggest that we take it from there? Happy to have your comments, Tarek 22:05, 5 January 2010 (UTC)
- Ok, with the change of Esperanza's vote from English to Arabic we now have equal votes for English and Arabic.
- Any views how to proceed? --Tarek 22:06, 17 January 2010 (UTC)
Editing arabic text with a western style keyboard.
First you need to install Arabic script support in your operating system. You can then switch back and forth between the Arabic and the western key layout, as in these figures: [Map of arabic keys 1] and [Map of arabic keys 2].
- As I don't have an arabic keyboard, I use the "web keyboard" at www.lessan.org to enter short text and use copy/paste to put it on the map or wiki. While that would be difficult for longer text, it is ok for short text like names. Hope that's helpful for other mappers.
- --Lyx 19:13, 19 December 2009 (UTC)
- Note also that today most operating systems (Linux/Windows/Mac) allow you to plug in two keyboards, e.g. one Arabic (available everywhere for a few $) and one Latin, and type your input on either one. --Metehyi 21:57, 20 December 2009 (UTC)
Can Garmin devices show arabic script in street maps ?
In principle yes ! Garmin devices are able to display arabic script, as you can see from the screenshots below.
Improving Mapping Quality
In the last few weeks I have done some editing on the map of Egypt and found that a lot has been mapped already. But I also noticed that there is some room for improvement; my goal is to get the map data into a state where it can be used for routing software. Here are a few details and suggestions:
- Africover import: Comparing the roads imported from Africover to existing GPS traces and Yahoo lowres images, I found that the imported roads are sometimes up to 100 meters off from their real position. I have moved some of them and retagged them as "source=africover; yahoo lowres". I have also found that sometimes the Africover roads are in reality irrigation canals or railway lines. I have removed those that duplicated existing roads, railways or canals when I noticed it. Many segments of the Africover roads have no real connection with each other or the rest of the road network, just multiple nodes in the same spot; this can be fixed by merging the nodes.
- On residential and other roads I sometimes see roads crossing each other without a connecting node; these connections look okay on the map but routing software would not know that you can drive from one road to the other in this place.
- Bridges sometimes don't have a layer tag.
- Bridges as part of a one-way street as well as connecting links between one-way streets sometimes go in the wrong direction, so routing software would find these sections undriveable.
- Streets are given only names, without differentiating between level or grade like Street, Avenue, Lane, Highway, Bypass, etc. For example, one often finds name=Jamal A. Nasser where it should be name=Jamal Abd El Nasser Street or name=Jamal Abd El Nasser Avenue, etc. --Metehyi 21:59, 13 May 2010 (UTC)
The KeepRight! tool helps a lot in finding these and other potential problems in the map. However, don't change the map just to make the tool happy, sometimes the tool makes mistakes. --Lyx 08:48, 17 February 2010 (UTC)
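As an illustration of one check from the list above (multiple unconnected nodes in the same spot), here is a rough sketch that groups nodes by position. It is not how KeepRight works; the function name `find_stacked_nodes`, the input format of (id, lat, lon) tuples and the sample values are all assumptions, as is rounding to 7 decimal places before comparing.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def find_stacked_nodes(nodes: Iterable[Tuple[int, float, float]],
                       precision: int = 7) -> List[List[int]]:
    """Group node ids that share the same rounded (lat, lon) position."""
    by_position = defaultdict(list)
    for node_id, lat, lon in nodes:
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    # Only positions holding more than one node are merge candidates.
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


if __name__ == "__main__":
    # Invented sample data: nodes 1 and 2 sit on the same spot.
    sample = [(1, 30.0, 31.0), (2, 30.0, 31.0), (3, 30.1, 31.0)]
    print(find_stacked_nodes(sample))
```

Each reported group still needs a look in the editor before merging, since nodes on different layers (a bridge over a road, for instance) may share a position on purpose.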