Category Archives: translatewiki.net

Drawing i18ned text in images.

A picture is worth a thousand words, but drawing a word can be harder than one expects.

Usually it is a good idea to avoid text in images, for multiple reasons. Foremost, text in images makes localisation hard: it requires tools, some skill in image manipulation, and manual work. Avoiding it also means only one copy of each image needs to be stored.

In some cases it is unavoidable to use text in images. In other cases… it is used for lesser reasons. In this post I will not talk about layout issues, like limited space and inflexibility in image size. In Betawiki we have hundreds of languages, many of which use poorly supported scripts.

The PHP GD library provides two functions to draw text. imagestring can only draw text in latin-2, so we can forget it immediately. The other one is imagettftext, which since PHP 5.2.0 accepts UTF-8. Great, now we can pass it all the translations we have. The next problem is choosing a suitable font, since imagettftext specifically needs the path to a font file as a parameter. As we know, no single font covers all scripts, and hard-coding a mapping from language codes to particular fonts would require everyone using the code to install exactly those fonts.

The only way to automatically choose a proper font for a language (script) code is fontconfig. I have written a wrapper that calls the command line utilities of fontconfig to fetch the most suitable font. This does not solve the missing font problem, but if there is a suitable font on the system and fontconfig knows about it, it will be used. And yet, there are still problems, like wrong rotation for Japanese.
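The wrapper might look roughly like the sketch below (function names and the drawing demo are my illustration, not the actual Betawiki code, and the --format option assumes a reasonably recent fontconfig). fc-match picks the best font file for a language tag, and the path is then handed to imagettftext:

```php
<?php
// Sketch only: ask fontconfig for the best font for a language code,
// then draw UTF-8 text with it. Not the actual Betawiki wrapper.

/** Build the fc-match command line for a language code. */
function buildFcMatchCommand( $code ) {
	return 'fc-match --format=%{file} ' . escapeshellarg( ":lang=$code" );
}

/** Return the path of the best matching font, or null if none was found. */
function findFontForLanguage( $code ) {
	$out = shell_exec( buildFcMatchCommand( $code ) );
	$path = $out === null ? '' : trim( $out );
	return $path === '' ? null : $path;
}

$font = findFontForLanguage( 'ja' );
if ( $font !== null && function_exists( 'imagettftext' ) ) {
	$im = imagecreatetruecolor( 400, 100 );
	$white = imagecolorallocate( $im, 255, 255, 255 );
	$black = imagecolorallocate( $im, 0, 0, 0 );
	imagefilledrectangle( $im, 0, 0, 399, 99, $white );
	// imagettftext() accepts UTF-8 since PHP 5.2.0.
	imagettftext( $im, 20, 0, 10, 60, $black, $font, '日本語のテキスト' );
	imagepng( $im, 'sample.png' );
	imagedestroy( $im );
}
```

If fontconfig finds no font for the language, the sketch simply skips drawing; the real missing-font problem remains for the system administrator to solve.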

The big question: is there any better way to do this?

Page translation + documenting = translated documentation???

Not yet, at least. I was sick for a few days, and this week I actually worked mostly on getting page translation working. I also wrote some more documentation, but it is not yet published. Wiki page translation should now work, with some caveats, though it doesn’t yet have all the features I wanted. See a very simple example here.

It can now display the languages and, approximately, how complete and up-to-date the translations are. A suitable translation is not yet selected automatically for the user, but at least users can now see which languages are available and view them, unlike in the previous version.

This project ends in a week. It has been very nice, and I still hope I can recover a little from the problems encountered in this task. Let’s hope the summer doesn’t end this week too, even though I have already made my schedule for the next study year at the university.

Status update

This update is somewhat delayed, a bit too much even in my opinion. There have been some problems with the wikipage translation design I started with, like broken and complicated caching. I’m now trying a different approach, but I’ve already spent more time on this than the two weeks I had allocated for it. Tomorrow I have an exam, but I plan to spend the rest of the week trying to get something usable out.

After that I will move on to the other items, two-way changes and documentation. If I can finish them quickly, I could resume working on wikipage translation if needed. In any case it looks like I won’t have time to work on the optional features.

New week – new task

The time allocated for stats ended a little over a week ago. I added a few new features, like per-hour granularity and counting active translators instead of edits, and there is now a simple GUI to generate the inclusion code for those who don’t care to remember the parameters.

My current task is wikipage translation. I have been waiting for this task and I’m very excited to see what will come of it. The basic system should already work, and it is being tested on an example page. That means it is possible to mark content for translation, translate it, and have changes to the content invalidate the translations. But as you can see, it is still missing a lot, for example language selection.

Status update: Statistics etc.

My progress on implementing nice statistics has been an on-off trip. Both MediaWiki and FreeCol are going to make releases soon. And then there are all kinds of bugs here and there that I feel obligated to fix. During the weekend I managed to fix a very bad memory leak: one of our scripts was using all the memory on the server, compared to a quite stable 30M after the fix. I really want to thank milian from #geshi for the help with xdebug and his nice tools to identify the cause.

Gettext and Xliff: Nothing much here. I still haven’t tested msgmerge, so it remains to be seen how well it works.

Other features: Special page alias translation got a really big boost. Suddenly the number of supported extensions has grown to 23, and we have already “produced” hundreds of translations in many languages. Message formatting checks got small improvements, and now that the leak is fixed, we can update those regularly too.

So let’s move on to the stuff I was meant to do: stats. Thanks to a friend who suggested using PHPlot, I have managed to make pretty good progress on this despite all the other stuff going on. I think I’m going to explain my progress with a few examples and eye candy. Click the images to show full size versions if they are scaled.

First we have a graph showing the number of translation edits per day in Betawiki.

All translation edits in MediaWiki

It is also possible to compare projects:

Edits to MediaWiki and FreeCol compared

And then we have graphs in our portals:

Finnish translation edits

Or if you want to compare how your worst (best?) rival is doing much better than your language:

Comparison of Finnish and Swedish activity

Or do it only for one project:

Comparison of Finnish and Swedish activity for mobile broadband configuration assistant

We also have graphs in our project pages.

As you can see, the labels could use some polishing. There is no GUI for generating these, but it is easy if one knows the configuration parameters. It is possible to include them in pages with the special page inclusion syntax: {{Special:TranslationStats/language=xx;days=nn;group=id}}. The size can also be changed with the width and height parameters.
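For example (the parameter values below are made up for illustration), a 30-day graph of Finnish translation edits for a group with id core, at an explicit size, would be included as:

```
{{Special:TranslationStats/language=fi;days=30;group=core;width=600;height=400}}
```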

Every graph looks visually about the same. I kind of like it, but YMMV. If this feature turns out to be very popular, I will have to figure out how to do more aggressive caching. The data is fetched from Betawiki’s recent changes table, which means that external changes are not counted: one more reason to use Betawiki.

Localisation of images

Amidst fixing bugs I remembered an old feature request for localising images. One image may be worth a thousand words, but what if those words are in a foreign language? Now it is possible to replace anglocentric images in the user interface with localised ones. I’ll use this opportunity to add some images to my pretty boring blog entries :)

So here is the current default toolbar in MediaWiki’s edit view:

Here is the same when using Arabic as the user interface language:

And one more example, which is for Belarusian (Taraškievica orthography):

Checks and bugfixes

This is a status report again.

The possibility to translate aliases for special pages didn’t cause any mass movement. Let’s wait another week to see what happens. In the meantime I’ve improved the check framework. There are now dedicated checks for FreeCol, the existing checks were improved, and it is easy to add checks for new types. It is also possible for translators to get a list of messages tagged by the checks. The downside is that running the checks is slow, so the list doesn’t update in real time. Anyway, it is better than before.

Gettext got improvements and fixes, thanks to a new testing project: Mobile Broadband Configuration Assistant, a project of another Summercode Finland participant and friend of mine, has been enabled for translation for testing. The same guy got me to fix one bad usability issue that was long overdue. It was a quick hack, but a start toward something better.

Xliff support is still stalled, but I managed to contact the nice Pootle developer I met at FOSDEM 2008. We talked briefly about the ways we could co-operate (less so in terms of code) and a little about which features should be implemented. I didn’t get much out of it, as Xliff is quite a new standard, and support in other tools is also in an emerging state. This confirms my thought that it is not worth shooting in the dark; better to implement features for it when there is a need or demand for them.

The special Special Pages of extensions

The first phase of my Summercode Finland project is almost ready. Support for native Gettext projects is in the testing phase, and Xliff support is waiting for comments on which parts of the standard should be supported. In other words, there haven’t been many changes to file format support lately. This week I fixed some bugs found in Gettext testing which actually affected all groups, regardless of file format. For some reason, every time I look at my code I find places to improve and clean up. I cleaned up the command line maintenance scripts and sprinkled in a few copyright headers and so on. In the process I managed to introduce a handful of new bugs, but that always happens when I code :).

But let’s talk about the post title. It means that the names of special pages shown in your browser’s address bar are no longer sacred: they can be translated like almost everything else. Now that Firefox 3 has been released, many current browsers even display them nicely, as Заглавная_страница rather than some unfriendly percent encoding like %D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0.
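That percent-encoded blob is nothing more than the UTF-8 bytes of the Cyrillic title, escaped byte by byte; a couple of lines of PHP show the round trip:

```php
<?php
// The percent encoding is just the title's UTF-8 bytes, escaped one by one.
$title = 'Заглавная_страница';
$encoded = rawurlencode( $title );
// rawurlencode() leaves '_' alone but escapes every Cyrillic byte.
echo $encoded, "\n";
echo rawurldecode( $encoded ) === $title ? "round trip ok\n" : "mismatch\n";
```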

Actually, we have supported this for a long time already, but only for MediaWiki itself, not for special pages provided by MediaWiki extensions. Special pages can have multiple aliases, any of which can be used to access the page, which means they need some special handling. All of the complexity (yeah right… one do-while loop) is fortunately hidden behind a variable.
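Resolving a localised alias back to the canonical special page name is conceptually just a lookup like the following (a sketch of the idea, not the actual MediaWiki code; the Finnish alias is made up):

```php
<?php
// Sketch: map any known alias of a special page back to its canonical name.
// Not the actual MediaWiki implementation; 'OmaToimintosivu' is invented.
function resolveSpecialPageAlias( array $aliases, $name ) {
	foreach ( $aliases as $canonical => $list ) {
		if ( in_array( $name, $list, true ) ) {
			return $canonical;
		}
	}
	return null; // Unknown alias.
}

$aliases = array(
	'YourSpecialPage' => array( 'YourSpecialPage', 'OmaToimintosivu' ),
);

echo resolveSpecialPageAlias( $aliases, 'OmaToimintosivu' ), "\n";
```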

To make your extension support translation of special page aliases, you only need to add one line of code and create one file.

$wgExtensionAliasesFiles['YourExtension'] = $dir . 'YourExtension.i18n.alias.php';

And that file should look something like this:

<?php
/**
 * Aliases for special pages of YourExtension extension.
 */

$aliases = array();

/** English
 * @author YourName
 */
$aliases['en'] = array(
	'YourSpecialPage' => array( 'YourSpecialPage' ),
);

At least the first instance of YourSpecialPage should be the same as the key you used when declaring your special page with $wgSpecialPages. Note that WordPress likes to mangle quotes, so it is not safe to copy-paste verbatim from the above.
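Translators can then add further languages to the same file. For instance (the Finnish alias and author name below are invented for illustration), a Finnish section could look like this:

```php
<?php
// Continuing the alias file above with a second language.
// 'OmaToimintosivu' is an invented Finnish alias, for illustration only.
$aliases = array();

/** English
 * @author YourName
 */
$aliases['en'] = array(
	'YourSpecialPage' => array( 'YourSpecialPage' ),
);

/** Finnish (suomi)
 * @author SomeTranslator
 */
$aliases['fi'] = array(
	'YourSpecialPage' => array( 'OmaToimintosivu' ),
);
```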

All this was committed today, so there may still be some changes, as always with brand new code. And the good news doesn’t stop there. I already rewrote Special:Magic of the Translate extension to support translating these! It already has two extensions defined: Translate and Configure. The number of supported extensions will probably grow soon.

Project progress

Had to spend some time maintaining Betawiki, so the progress has been a little slow for the past week. Aside from that I’ve been working on many things.

I have set up a test project for Gettext: a Wesnoth campaign. It is not shown to all yet, just to the few of us testers who are going to translate it using Betawiki. It has already helped to find some bugs and simplify the code, and the edit view got support for displaying information extracted from the pot file.

To make a project available for translation, it is not enough to just add it to the list. That part is easy: checking out the files and about twenty lines of code. But to really support a project, we need to work closely with the development team and with the existing translation communities around it. It would be easy if we could just get everyone to use Betawiki immediately, but often some people don’t want to use the web interface for one reason or another. We need to set up rules about which languages are translated where, to avoid conflicts; map our language codes to the ones the project uses; and set up some kind of integration process so that translations actually get upstream, and upstream changes propagate back to us in a proper way.

But back to the project. I have been reading the Xliff format specification. It is fortunately quite short and clear, and has nice examples. Xliff supports all kinds of nice features, and I have been trying to decide which of them we need to support. I wrote a simple implementation that can export translations as a minimalistic Xliff file. It was actually pretty easy to do, under 100 lines of code. It would be really good to get someone who uses programs that accept Xliff files to comment on which features would be useful to implement. In any case, I will implement a parser too this week, so that we can get those translations back as well :).
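A minimalistic export along those lines might look like the sketch below (my illustration, not the actual Betawiki code; the group name and languages are made up). Each message simply becomes a trans-unit with a source and a target:

```php
<?php
// Sketch of a minimalistic XLIFF 1.2 export. Not the actual Betawiki code.
function exportXliff( array $messages, $sourceLang, $targetLang, $original ) {
	$xml  = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
	$xml .= "<xliff version=\"1.2\" xmlns=\"urn:oasis:names:tc:xliff:document:1.2\">\n";
	$xml .= "<file original=\"" . htmlspecialchars( $original ) . "\" " .
		"source-language=\"$sourceLang\" target-language=\"$targetLang\" " .
		"datatype=\"plaintext\">\n<body>\n";
	foreach ( $messages as $key => $pair ) {
		$xml .= "<trans-unit id=\"" . htmlspecialchars( $key ) . "\">\n";
		$xml .= "<source>" . htmlspecialchars( $pair['source'] ) . "</source>\n";
		$xml .= "<target>" . htmlspecialchars( $pair['target'] ) . "</target>\n";
		$xml .= "</trans-unit>\n";
	}
	$xml .= "</body>\n</file>\n</xliff>\n";
	return $xml;
}

// Hypothetical example data: one message from a Wesnoth-like group.
$xml = exportXliff(
	array( 'main-page' => array( 'source' => 'Main Page', 'target' => 'Etusivu' ) ),
	'en', 'fi', 'wesnoth-campaign'
);
echo $xml;
```

A parser for files like this then just needs to walk the trans-units and read the target elements back.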

If the test project doesn’t bring any big surprises, I will start preparing to tackle the next task in the project schedule.

Unproductive start for a week

Well, maybe unproductive is a bit of an overstatement, but considering I didn’t advance much in my summer project, it wasn’t very productive either.

Anyway, on Monday the internet connection was down for a good number of hours. I got fed up with it and started cooking! I don’t usually cook for myself, so I’m not very good at it. It was tasty however, which gives me courage to do it more often. I spent the rest of the day playing Sid Meier’s Alpha Centauri with the expansion disc. I love that game!

Oh, and FreeCol 0.7.4 was released yesterday (Monday). It didn’t go as well as I hoped: I was unable to commit the latest changes done after Sunday before the release, because the connection was broken. I hope there wasn’t too much effort put in after Sunday. Now that 0.7.4 is released and the branch is officially dead, we finally have to migrate to the 0.8 branch. Most of the preparations have already been done. I wrote a script that tries to guess key mappings and other changes, so I have the list. In a few days we will rename all FreeCol messages, moving FreeCol to its own namespace, removing the prefix, and renaming old keys to new names. The keys have to be fixed in the files too. Expect a short downtime during which it is not possible to translate FreeCol while we make these changes.

And then Tuesday. On Monday Tim Starling committed a change to MediaWiki code that moved files around. (Tim is, by the way, my summer project mentor, but this is unrelated.) There was a short breakage when Siebrand tried to update the code normally, and it was quickly reverted. Today I committed most of our local changes again: a fix to comments; special page normalisation, try 2; a rewrite of Special:RecentChanges to add a few hooks; etc. I don’t know how many bugs I introduced yet again, but let’s hope not too many. Thanks to Ialex, who already fixed a few.

After that I put Betawiki into maintenance mode and started updating and merging changes. It really didn’t help that my shitty local Internet Service Provider had 35% packet loss while I was doing it… ssh was irritatingly slow. I still managed to do it in less than 10 minutes, and now we are back up and running, with fewer local changes.

The rest of the time was spent eating (the same food as yesterday) and fighting with the paperwork for an application for a new place… I have to get away from the dorm before my head explodes. In the evening I probably have to read up on the XLIFF format to implement support for it, or test the Gettext implementation, or write some documentation for it, or something else… who knows.

As a grain of salt, here are some nice features (or something like that) we got last week:

  1. Possibility to count fuzzy messages in group statistics
  2. Possibility to hide languages that have no translations at all
  3. First pieces of Gettext plural support
  4. Ability to blacklist some authors from the credits (mainly for bots, or for those of us who do maintenance-like work)
  5. FreeCol got lots of optional messages
  6. Improvements to the RecentChanges filters, implemented the previous week and finally committed to the svn repository today.