Google Summer of Code/2012/Improve support for non-latin languages in Mapnik text rendering
The goal of this Summer of Code project will be to fix most (or all) problems related to non-latin languages in Mapnik.
Description
Mapnik has very poor support for rendering text in non-latin languages, because its algorithms lack robustness in the face of complex Unicode and RTL (right-to-left) text. The problem lies in the placement finder, the part of Mapnik that determines where text should go. This involves selecting the fonts, calculating glyph positions and rotations, etc. Basically, all the work required to render text, except actually drawing the pixels, is done there.
Help from the community
I need help from the community to find test cases for my code. If you know of something that is rendered incorrectly, please add a link here (preferably with a description of how it should look):
- User reports on OpenStreetBugs - can't check if it's rendering or spelling issue http://openstreetbugs.schokokeks.org/?zoom=13&lat=13.09537&lon=103.18181&layers=B0T http://openstreetbugs.schokokeks.org/?zoom=16&lat=14.22494&lon=104.07896&layers=B0T
- Add your problem here
Research
Pango
Pango's layout objects are very easy to use and provide a lot of useful functionality. At first I was unsure whether Pango was usable for Mapnik at all, and I still cannot say for certain that its layouts can meet Mapnik's layout needs with a reasonable amount of supporting code, but I am now fairly confident that it is a good choice.
HarfBuzz
Probably the most flexible solution in terms of shaping (from what I have read), but I could not find any documentation, and it is unclear how stable the API is. HarfBuzz is used by Pango and Qt, so a solution using Pango would use HarfBuzz indirectly. HarfBuzz alone is also not enough: it would require additional libraries, such as FriBiDi, to support bidirectional text.
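To illustrate why bidirectional text needs dedicated support, here is a minimal sketch in Python that splits a label into left-to-right and right-to-left runs using only the standard library's `unicodedata` module. It is not how Mapnik, FriBiDi or Pango work internally; the function name `directional_runs` is made up for this example, and the handling of neutral characters is a deliberate simplification of the Unicode Bidirectional Algorithm.

```python
import unicodedata

# Bidi classes of strongly right-to-left characters (Hebrew, Arabic, ...).
RTL_CLASSES = {"R", "AL"}
# Bidi class of strongly left-to-right characters (Latin, Greek, ...).
LTR_CLASSES = {"L"}


def directional_runs(text):
    """Split text into (direction, substring) runs.

    Simplification (assumption of this sketch): neutral characters such as
    spaces, digits and punctuation join the run that precedes them. Leading
    neutrals form a run with direction None. A real bidi implementation
    resolves these cases far more carefully.
    """
    runs = []
    current_dir = None
    current = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            ch_dir = "rtl"
        elif bidi in LTR_CLASSES:
            ch_dir = "ltr"
        else:
            ch_dir = current_dir  # neutral: stay in the current run
        # A change of direction closes the run collected so far.
        if ch_dir != current_dir and current:
            runs.append((current_dir, "".join(current)))
            current = []
        current_dir = ch_dir
        current.append(ch)
    if current:
        runs.append((current_dir, "".join(current)))
    return runs


if __name__ == "__main__":
    # Latin text followed by two Hebrew letters (written as escapes).
    for direction, chunk in directional_runs("Tel Aviv \u05ea\u05dc"):
        print(direction, repr(chunk))
```

Each run would have to be shaped and placed in its own direction, which is the kind of work a bidi library takes over.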
ICU
ICU seems to provide all the features we need, but it looks like it is much more work to use them. For example, Pango does the line breaking automatically, but with ICU we have to do it ourselves.
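As a rough idea of what "doing the line breaking ourselves" means, the following Python sketch implements a naive greedy line breaker. It is only an illustration under strong assumptions: it does not use ICU, the name `break_lines` and the fixed `char_width` advance are hypothetical, and it breaks only at whitespace, which is not adequate for scripts that do not separate words with spaces.

```python
def break_lines(text, max_width, char_width=1.0):
    """Greedily pack whitespace-separated words into lines.

    char_width is a hypothetical fixed advance per character; real code
    would measure the advances of the shaped glyphs instead.
    A single word wider than max_width is kept on a line of its own.
    """
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and len(candidate) * char_width > max_width:
            # The word does not fit: close the current line, start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in break_lines("Rue de la Paix", max_width=6):
        print(line)
```

With ICU, the break opportunities would come from the library rather than from `str.split()`, but the surrounding loop, the width measurement and the placement of each line would still be code we have to write and maintain.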
Conclusion
- HarfBuzz is not a suitable candidate, because it does not provide all the required functionality and is hard to use.
- ICU does everything we need, but requires much additional code.
- Pango seems to support everything we could dream of with little added complexity inside Mapnik. It should be possible to drop quite a bit of Mapnik's rendering stack. Some parts will need rewriting, because processing has to happen at different steps than it currently does, but the same may or may not be true of ICU.
I think Pango is the best solution.