
FOSDEM talk reflections 3/3: HipHop, communities, public procurement

Nikerabbit arrives at the MediaWiki meetup at FOSDEM

Meetup of the MediaWiki community. Or Wikimedia tech? What should we call the Wikimedia software development ecosystem? (Photo by henna, copyright status unknown.)

This is the third post about FOSDEM 2013; see 1/3: I18n in the WEB, Mozilla i18n and L20n for the first and 2/3: docs, code and community health, stability for the second. Links to the abstracts are in the headers.

Scaling PHP with HipHop

HipHop is still alive, and faster than ever. It has evolved from a PHP-to-C++ translator into a bytecode virtual machine with a JIT compiler, much like PHP's own runtime, except that PHP has no JIT. The speedups they are seeing are impressive (it was deployed across Facebook about a month ago). Now that the compile-everything-before-deploy step is gone, it is much more feasible to use.

I'm considering giving HipHop a try on translatewiki.net later this year, probably after we have upgraded to at least Ubuntu 12.04, for which Facebook provides packages. It is still a pain in the ass to set up manually, as it was a few years ago. The Wikimedia Foundation (WMF) has dropped its evaluation, but perhaps they will reconsider after our experiences; HipHop, or hhvm as it is now called, has indeed changed a lot since then.

It was highlighted that the supported language features and libraries of hhvm and PHP differ to an extent. hhvm provides some nice features like strict type hinting, but it is unlikely I can use those anytime soon: there is no way to take advantage of them on hhvm without breaking support for normal PHP, which really cannot be done in the MediaWiki ecosystem.
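
As a rough illustration (my own sketch, not from the talk, and the exact syntax hhvm accepted at the time may have differed), this is the kind of scalar type hinting that hhvm allows but that a stock PHP 5.x interpreter refuses to parse, which is exactly why it cannot be shipped in MediaWiki code:

<?php
// Scalar parameter and return type hints: fine on hhvm, a parse error
// on the PHP 5.x installations MediaWiki still has to support.
function formatByteSize( int $bytes ): string {
    return round( $bytes / 1024, 2 ) . ' KiB';
}
echo formatByteSize( 2048 ); // 2 KiB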

Community/BOF meetup

Almost 20 people were around, a few of them from outside WMF. Discussions circled around events like the Amsterdam Hackathon and MediaWiki groups. The most interesting part (to me) was what to call the Wikimedia software development ecosystem, so that it can be marketed properly. Suggestions ranged from extending the meaning of MediaWiki to cover everything, including mobile, gadgets and so on; to using Wikipedia, since it is the best-known brand; to creating a new Wikitech brand.

There are pros and cons to each of the above, but one thing is true: there is no name that can currently be used to refer to everything technical done around MediaWiki and Wikimedia that would also be understood by potential participants. Also, MediaWiki development is not perceived to be cool anymore, because it's PHP. But it isn't only PHP: MediaWiki development is also Redis, Varnish, Puppet, git, Solr, HipHop, semantic, node.js, mobile, OpenStack, and more. Quim Gil will continue the work on coming up with a brand. Curious readers can compare this to what KDE did recently when it expanded the meaning of KDE to cover not only the desktop, but also the community and everything they do. The change process wasn't painless for them, and wouldn't be for us, but at the same time (IMHO) the change has been quite successful and beneficial to KDE.

Qt Project Update

Qt booth at FOSDEM

Qt: having a maintainer for each subsystem helps get your patch reviewed, unlike in MediaWiki

Qt is doing well. Qt5 is evolution instead of revolution (which Qt4 was compared to Qt3). Contributions from outside Nokia (and nowadays Digia) have risen to about one third of all commits. Like MediaWiki, they use Gerrit. But unlike MediaWiki, they have an explicit hierarchy, with maintainers who are responsible for keeping each subsystem in shape. It also means that there is always at least one person you can talk to to get your patch reviewed, unlike in the MediaWiki community.

Their platform support is also nice: Linux, Windows and OS X are fully supported, while iOS, Android and BlackBerry OS are also more or less working.

Fixing public procurement

Forgive me if I use incorrect terms. In a nutshell, there is a law in Finland (coming from the EU) that forbids government organizations from requesting software systems by referring to a specific producer. For example, a hypothetical request like “We want Microsoft Office on all our work stations in X department” is illegal, while “We want an office tools suite that includes documents, presentations, …” is legal. Free software people in Finland analysed how many times this law has been violated (quite many) and have been sending letters asking the offending organizations to read the rules and fix their procurement.

The talk continued with the observation that there is no entity to enforce this rule, and that it is difficult to get the companies that were put at a disadvantage to sue. One side argued that pursuing this particular issue in court harms the wider effort to educate people about this and other issues. When is it useful to sue instead of trying to educate? When are the lost opportunities a bigger harm than the bad publicity and the money spent on suing? Apparently Microsoft has sued successfully in Finland, gaining lots of money without a big PR hit. Open source solutions are usually discarded because the exit costs of the previous system are attributed to the new solution instead of to the old vendor lock-in solution.

All in all, the goal of this kind of work is to further open source use in government by enabling free competition; they want to do it EU-wide.

The Keeper of Secrets: The Dance of Community Leadership

FOSDEM party crowd

The FOSDEM pre-party was definitely not a quiet, beerless one

This is the first time I've seen Leslie Hawthorn speak. From her talk, which was full of beer jokes (a few too many for my taste), I caught some points:

  • Don’t be a jerk.
  • Stop gossiping and talk directly to people you have problems with.
  • Don’t ignore difficult people, be brave enough to let them know your honest opinion.
  • Don’t be a jerk while talking about difficult things with people, do it politely and cooperatively.
  • Face to face meetings are essential to community building (my addition: also include opportunities to do it in beerless, quiet places).

FOSDEM talk reflections 2/3: docs, code and community health, stability

This is the second post about FOSDEM 2013; see 1/3: I18n in the WEB, Mozilla i18n and L20n for the first. Links to the abstracts are in the headers.

Open Sourcing Documentation

Don't keep documentation to yourself; release it with an open license. Others might see the forest for the trees when you are too close to the problem. Translators can translate the documentation, but that needs proper tools, something that was not mentioned in the presentation.

Also mentioned in the presentation were webplatform.org (greetings to Ryan Lane, who helped build it) and the problem of Mozilla having to support both webplatform and its own MDN, where the latter has a wider scope and a more restrictive license than the former.

Coping with the proliferation of tools within your community

FOSDEM entrance

Entering the MediaWiki community and contributing to MediaWiki is hard for many reasons

I got the impression that XWiki is the everything-and-the-kitchen-sink of wikis. It has several nice points related to having everything in one place (one wiki):

  • Only one place to have an account and one place to sign in.
  • You can search everything (bugs, commits, IRC logs and documentation) at once.

In my opinion it's a nice starting point for projects, but in the end it is also the quality of the individual tools that matters when projects grow. I don't see how MediaWiki could replace Gerrit or Bugzilla with something provided by XWiki.

XWiki is one of the candidates to take over from MediaWiki if MediaWiki is not able to revitalize its community by improving the extension and gadget development ecosystem. XWiki being written in Java can be a deterrent for some, compared to PHP (but the opposite is certainly true, too).

How we made the Jenkins community

It's not enough to lower the barriers; the barriers must be removed wherever possible. While Wikipedia is (or was) quite open and easy to use (not going too deep into that), MediaWiki development had and still has many barriers. It was not long ago that getting commit access took ages and you had to present your CV. Nowadays commit access is easier to get because we use Gerrit to review code before it is merged. But Gerrit is quite a complex beast, and we still don't have GitHub integration to accept drive-by patches. The lack of documentation and the difficulty of getting patches reviewed are also problems, not to mention the lack of a shared vision of where MediaWiki should go.

Another problem that I think is not being recognized is that MediaWiki code, while not as bad as it used to be, is in my humble opinion not improving fast enough. MediaWiki core is a huge monolithic piece of code, entangled in so many places that it is impossible to extend without first refactoring the affected code. This greatly hampers innovation and the creation of new extensions and is thus a problem for the MediaWiki ecosystem. The core code needs to become cleaner and more modular. The modularity and quality of the Jenkins APIs seemed to be a major reason for the laudable growth of its community.

Wish: can we have an integrated search that provides results from mediawiki.org, Bugzilla, IRC logs, mailing lists and other relevant places, and not just a service on Labs known to the lucky few? There is also a lot of stuff in Etherpads, Google documents and even blogs that will probably never be searchable.

Improving Stability of Mozilla Products

Mozilla booth at FOSDEM

Something we lack compared to Mozilla is crash metrics (note the diversity at their booth)

This talk made me wish for information on what actual problems MediaWiki users and developers are facing. Unfortunately, due to the nature of PHP, fatal errors and warnings can be caused by so many things, including syntax errors users made in their LocalSettings.php, that the data would need lots of filtering. There are also privacy concerns, but it should be possible to do something by collecting exceptions and JavaScript errors and aggregating them for some selected few to see. This would help to prioritize bugs, as Bugzilla and IRC channels are bad indicators of how many people are actually encountering a particular issue.
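
As a minimal sketch of the idea (my own, not an existing Wikimedia tool; the log path is made up), PHP errors and uncaught exceptions could be funnelled into one machine-readable log that a small dashboard could later aggregate and count:

<?php
// Funnel PHP errors and uncaught exceptions into one JSON-lines log.
// Aggregating that log by message would show which problems are the
// most common, instead of guessing from Bugzilla and IRC traffic.
$logFile = '/var/log/mw-error-aggregate.log'; // hypothetical path
set_error_handler( function ( $severity, $message, $file, $line ) use ( $logFile ) {
    error_log( json_encode( array(
        'type' => 'php-error',
        'severity' => $severity,
        'message' => $message,
        'where' => "$file:$line",
    ) ) . "\n", 3, $logFile );
    return false; // let the normal error handling run too
} );
set_exception_handler( function ( Exception $e ) use ( $logFile ) {
    error_log( json_encode( array(
        'type' => 'exception',
        'class' => get_class( $e ),
        'message' => $e->getMessage(),
    ) ) . "\n", 3, $logFile );
} );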

I'd like to be optimistic, but the best thing we have so far is translatewiki.net, which collects PHP errors, warnings and exceptions as well as JavaScript errors. However, aside from the PHP warnings and notices that are announced on IRC in #mediawiki-i18n, that information is only available to me and a few others, and after all translatewiki.net is just a minor piece of the total system. One of my personal wishes is that more priority be given to the issues affecting translatewiki.net, as experience teaches us that issues discovered on translatewiki.net will most often also surface in Wikimedia wikis. Mozilla can state things like: this bug affects 10% of all active users, or a hundred million users every day. Can we please have that, too?

It would be so awesome if the Wikimedia Foundation could release similar kinds of information to the wider public.

Shorts

IonMonkey: Yet Another JIT Compiler for JavaScript?

Why is yet another JIT compiler needed? Because the new one is better! Sometimes it makes sense. IonMonkey is based on well-understood techniques like static single assignment.

WebRTC: Real time web communication

The open source technology stack to kill Skype and Google Hangouts is in the works, but it will still take a while to reach a browser near you. Check out http://reveal.rs.af.cm/ with a recent version of Google Chrome for a preview of what can also be accomplished with WebRTC.

PDF.js – Firefox’s HTML5 PDF Viewer

This is something that the Wikimedia Foundation could perhaps use too, though limited browser support is an issue.

The presentation had some nice tidbits on how the PDF standard includes everything and the kitchen sink, such as a 3D model viewer. It also explained what is easy to port to HTML5 (many things can be done with the Canvas specification) and what is not. The HTML5 specifications are going to see additions due to this work.

Changesets evolution with Mercurial

A cool idea: tracking the changes to the commit history itself, so you can alter the history without deleting any part of it. It is questionable, though, whether this will see widespread use, and it is only coming to Mercurial for now, not Git.

FOSDEM talk reflections 1/3: I18n in the WEB, Mozilla i18n and L20n

FOSDEM 2013 t-shirt

FOSDEM 2013 was attended by several Wikimedians.

Now that I've slept on the presentations I attended at FOSDEM, it's a good time to think about what I heard and how it relates to what I am doing, and to do so before I forget it all. I didn't get to talk to that many people this year, as I was mostly running from one talk to another.

This series of blog posts will have three parts. I start with i18n-related topics and then cover the other presentations roughly in the order I saw them (headers link to abstracts). There will also be a follow-up post on the gettext format detailing its good and bad sides from today's point of view. Stay tuned!

An Integrated Localization Environment

Mozilla keeps pushing new i18n stuff, though the general feeling from this and other related talks is that they either have not defined what issue they are fixing, or they have defined it in a way that is completely different from what we are working on.

While we are trying to make translating as easy as possible for translators (in a technical sense; they already have enough complexity to deal with due to language itself), the ILE proposed in this talk is essentially an IDE (integrated development environment), a glorified text editor of the kind programmers use for programming. It has features like syntax highlighting and automatic completion for translation file syntax.

But do translators really care about the particular syntax of a translation file, or are they in fact happier if they do not need to care about files and version control systems at all, while at the same time having access to aids like translation memories and change tracking in an interface created by UX designers, as we have in translatewiki.net?

“It helps to see the messages above and below to understand the context”
You can see the related messages close to each other in almost any translation tool, even though showing related messages next to each other is not a replacement for proper documentation of context for each message.

“I don’t see how form based translation tools would cope with more complex localisation file formats like L20n”
I don't think the solution to facilitating proper localisation is to turn localisation itself into programming. The cases where more complex logic is needed are actually relatively few, and I think it is worthwhile to keep the common case as simple as possible while also supporting the more complex cases in a standardized, data-driven way, for example by using CLDR.

L20n

Mozilla presenters at FOSDEM


Mozilla keeps pushing new i18n stuff: who is the user they are designing new tools for?

This talk was an update to the similar presentation on L20n last year. What I said above about turning localisation into programming applies here too.

It is nice that you can specify grammatical gender for things, but this format does not really solve the problem that many variables actually come from user input, for which we cannot specify this information.

It is nice that you can make custom plural rules, but in almost all cases the standard set of plural rules that comes from standards like CLDR is enough.

It is nice that you can mix gender and plural, and even multiple plurals, in one message using nested hashes (arrays in PHP), but it is not nice at all that you have to translate the message N*M*O times as the number of variables increases. I firmly believe that an inline syntax like {{GENDER:$1|he|she}} eats {{PLURAL:$2|apple|apples}} is superior in this regard.

If we strip the plural, gender, time formatting and other such support from L20n, we actually just get a complex file format for storing things, something we already have many variants of. The aforementioned features are usually provided by the i18n library (or definitely should be; unfortunately this is not always the case), so what they have actually done is move the complexity of language from i18n libraries and software developers to translators. Measured against the aim of keeping the common case simple while supporting complex cases where needed, I don't think this, as presented, is a good trade-off between simplicity and flexibility.

webL10n: client-side i18n / l10n library

This talk was about adapting some nice parts of L20n to .properties format. The result is somewhat more complex than plain .properties and not as flexible as L20n. Even having gender and plural in the same message is problematic in this format.

I'd like to highlight two ideas in webL10n. Side note: why call it l10n when it is actually an i18n library for developers, similar to jquery.i18n?

The first idea is that you can have html like this:

<div data-l10n-id="retro">
<div>Please <a href="login/">log in</a></div>
</div>

And the translators see this:

retro = <div>Please <a>log in</a></div>

The translation, when displayed, is properly merged to the original html so that the classes and link targets are preserved. I don’t know what happens if the translation is outdated and the structure is changed, but I guess we just should not use outdated translations with this system. When escaping is handled properly, this is a very nice way to handle what we call lego messages, where the text of the link is in a separate message, because due to escaping we can’t have link and link text in the same message.

Another idea is that if you have HTML like this:

<input type="search" placeholder="Search messages" title="Message search box">

You can turn it into this:

<input type="search" data-l10n-id="searchbox">

And translators will see this (using the .properties format here):

searchbox.placeholder=Search messages
searchbox.title=Message search box

This simplifies the HTML the developers need to write.

Finally, also take a look at Pau's Design talks at FOSDEM 2013.

New language stuff for developers and users

MLEB 2013.01 has been released by Amir Aharoni. Lots of development has been happening in Translate due to the work on the new translation interfaces. If you are a developer, please also check out the latest new and changed Web APIs and give us a shout in #mediawiki-i18n on freenode if you see something obviously wrong or missing. Also included in this release are bug fixes for Universal Language Selector, while the other included extensions didn't see many changes.

Some months ago I wrote about Language tag validation in MediaWiki. A nice person named Siebrand Mazeland decided to improve the situation. As of now we have three new methods, developed by the Wikimedia Language Engineering team:

  • isSupportedLanguage
  • isWellFormedLanguageTag
  • isKnownLanguageTag

Unless these methods are backported to MediaWiki 1.19 and 1.20, it will take a while before they are used in extensions, but eventually we should see faster and more readable code.
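
For the curious, here is roughly how the new methods can be used (my own minimal example; the one-line descriptions reflect my understanding of the checks):

<?php
// Assumes MediaWiki core with these methods is already loaded,
// e.g. inside a maintenance script or an extension.
$code = 'pt-br';
// Does MediaWiki have localisation support for this code?
var_dump( Language::isSupportedLanguage( $code ) );
// Is the tag syntactically well formed (BCP 47 style)?
var_dump( Language::isWellFormedLanguageTag( $code ) );
// Is the tag a known language (MediaWiki names or CLDR data)?
var_dump( Language::isKnownLanguageTag( $code ) );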

MediaWiki Language Extension Bundle 2012.12

MediaWiki Language Extension Bundle 2012.12 was released just before Christmas. It is compatible with MediaWiki versions 1.20 and 1.21alpha. Downloads and installation instructions can be found at https://www.mediawiki.org/wiki/MLEB. Announcements of new releases will be posted to the mediawiki-i18n mailing list.

Here are the highlights:

cldr

  • English name for Azerbaijani (arz) was added.
  • A bug that caused local names for be-tarask not to be used was fixed.
  • Translations for be-tarask were updated.

Translate

Lots of development is ongoing in the translation user interface redesign project conducted by the WMF Language Engineering Team. The new message list and translation editor (pictured) are in alpha stage, but interested users can activate them by adding the URL parameter tux=1 on Special:Translate; tux=0 brings back the old interface.

$wgTranslateAC and $wgTranslateEC were removed. If you were still using these, switch over to the TranslatePostInitGroups hook or $wgTranslateCC.
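
For those migrating, something along these lines in LocalSettings.php should be the direction to go; treat it as an untested sketch from memory (the hook name comes from the release notes, but its parameter list and the group class used here are assumptions, so check the Translate documentation):

<?php
// Sketch only: register a custom message group via the hook instead of
// the removed $wgTranslateAC/$wgTranslateEC variables.
$wgHooks['TranslatePostInitGroups'][] = function ( &$groups ) {
    // Construct your MessageGroup subclass here and register it by id.
    $groups['my-custom-group'] = new MyCustomMessageGroup(); // hypothetical class
    return true;
};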

Bundled Solarium library was removed. Install it manually or use the MediaWiki Solarium extensions.

Sneak peek from the new translation UX: the new group selector (top) and part of the revamped translation editor.

Other noteworthy changes in Translate:

  • The ApiQueryMessageGroups module has lots of new functionality.
  • There is a new ApiQueryLanguageStats module.
  • (bug 39761) Special:TranslationStats edit counts now also include reviews.
  • GettextFFS: Handle empty but existing msgctxt properly
  • New hook: TranslateSupportedLanguages

Universal Language Selector

  • Fixed a display issue in the Modern skin.
  • (bug 42382) Indicate context in input settings/more languages

MediaWiki Language Extension Bundle launches!

The Wikimedia Language Engineering team is pleased to announce the first release of the MediaWiki Language Extension Bundle. The bundle is a collection of a few selected MediaWiki extensions needed by any wiki that wants to be multilingual.

This first bundle release (2012.11) is compatible with MediaWiki 1.19, 1.20 and 1.21alpha.

The Universal Language Selector is a must-have, because it provides essential functionality for any user regardless of the number of languages they speak: language selection, font support for displaying scripts badly supported by operating systems, and input methods for typing in languages that don't use the Latin (a-z) alphabet.

Maintaining multilingual content in a wiki is a mess without the Translate extension, which is used by Wikimedia, KDE and translatewiki.net, where hundreds of pieces of documentation and interface translations are updated every day. With Localisation Update your users will always have the latest translations fresh out of the oven. The Clean Changes extension keeps your recent changes page uncluttered by translation activity and other distractions.

Don't miss the chance to practice your rusty language skills: use the Babel extension to mark the languages you speak and to find other speakers of the same languages in your wiki. And finally, the cldr extension is a database of language and country name translations.

We aim to make a new release every month, so that you can easily stay on the cutting edge of the constantly improving language support. The bundle comes with clear installation and upgrade instructions. It is tested against MediaWiki release versions, so you can avoid most of the temporary breakage that would happen if you were using the latest development versions instead.

Because this is our first release, there may be some rough edges. Please give us plenty of feedback so that we can improve the next release.

Performance tuning translatewiki.net

One of the biggest advantages of desktop translation tools is that they don't have delays rendering the interface – at least not on the scale that websites do. On translatewiki.net it is crucial that our pages load very fast. In certain places we can and do use intelligent preloading to remove the delays; in other places we have to employ complex caching algorithms to reach that target. I regularly monitor the automatically collected profiling information to avoid regressions and to pick low-hanging fruit from time to time.

In the last sprint my main task was to convert the way we handle the translation of MediaWiki extensions on translatewiki.net to use the same processes and interfaces as pretty much everything else. MediaWiki and MediaWiki extensions were the first things supported on translatewiki.net, and now they are among the last to get modernized to take advantage of better interfaces built on years of experience supporting various kinds of products.

The only user-visible change is improved performance. The new interfaces are more efficient and enable more optimizations, which allows us to deliver faster page views and scale to more messages. It will also simplify the work of the translatewiki.net staff, as they won't need to follow two different processes, especially once we also update the MediaWiki translation code.

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.

As a developer I'm proud that the new code is unit tested. The culmination, however, was a change that removed hundreds of lines of old code: the above quote indeed applies to software development too.

For those interested in the details, the biggest performance boosts were achieved by avoiding the need to parse the translation files in many places – the lists of message keys and their values are stored in intermediate cache files in CDB format. In addition there were many smaller optimizations, like not using a certain MediaWiki method to construct a link element, which consumed 20 kilobytes of memory for each link. When there are thousands of links, that adds up and is excessive for producing just a few hundred bytes of output. I switched it to a lower-level method (memory usage dropped from 175 to 12 MB).
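
As a rough sketch of the caching idea (not the actual translatewiki.net code; the file path and keys are made up), MediaWiki's CDB wrapper lets you write the parsed keys and values once and then look up single values later without touching the original translation files:

<?php
// Write once after parsing the translation files...
$writer = CdbWriter::open( '/tmp/messages-example.cdb' );
$writer->set( 'group-x:key-foo', 'Some translated text' );
$writer->set( 'group-x:#keys', serialize( array( 'key-foo' ) ) );
$writer->close();
// ...then later requests read single values in constant time.
$reader = CdbReader::open( '/tmp/messages-example.cdb' );
$value = $reader->get( 'group-x:key-foo' );
$reader->close();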

Some low-hanging fruit might not be as easy to pick as it seems at first. (Photo CC-BY-SA by Asit K. Ghosh.)

At the time of writing I still have some more fixes pending further testing and cleanup. For example, to access any single message group, all of them currently have to be loaded. They are cached as serialized PHP objects, but loading them takes 20 milliseconds and 10 megabytes of memory. I'm working on making it possible to load cached message groups individually.

The website anyone can translate

Translatewiki.net has started using Puppet. Puppet is a tool designed to manage the configuration of servers. Like Wikimedia's, our configuration is public and stored in the translatewiki.net git repository, where anyone can submit patches. I don't expect a flood of them coming in anytime soon; my motivations for this were different. If you remember, some months back I had to learn some Puppet to write the Solr configuration for Wikimedia deployment. Now I wanted to learn more and gather more experience with Puppet. It will also greatly help if we ever need to reinstall the translatewiki.net server from scratch (which is quite likely to happen soon). As a bonus it gives transparency and something I can point people to when they ask how a particular thing is done on translatewiki.net. As time permits, I will be moving more configuration to Puppet.

Mitä isot edellä, sitä pienet perässä. (The Internet suggests the closest translation is “Monkey see, monkey do.”)

I also added the translatewiki.net repository to Ohloh. If you use translatewiki.net as your localisation platform, feel free to add it to your stacks by clicking “I use this”, or to embed its widgets in your website. Ohloh also gives some cool stats.


Together with the introduction of Puppet, I also switched the web server of translatewiki.net from lighttpd to nginx. The biggest reason for this is that https was broken for Google Chrome users, but in general nginx feels faster and more robust, and the way PHP is used with it is much simpler (php-fpm instead of spawn-fcgi). The Wikimedia operations team is supposedly going to test nginx soon, so we will see whether the tide goes that way there, too.

Muir Woods has one tree – plural issues in MediaWiki

While I was having fun with the rest of the Wikimedia i18n team in San Francisco, a stream of plural-related bug reports started coming in. The cause is that we recently scrapped the custom plural rules in MediaWiki in favor of using the plural rules from the CLDR database. A temporary fix has been applied to mitigate the reported issues.

The problem manifested itself in a pretty simple way: in some languages, in some contexts, the message always said one something. For example, the category page would say This category has one page regardless of how many pages there were in it. At first I was baffled: after all, we had written unit tests for all languages in MediaWiki, and they reported no regressions. It turns out we had ignored one particular set of languages: those which don't always use plurals and had no plural rules defined in MediaWiki. The problems started when translations in those languages used plural syntax even though they weren't supposed to. When plural rules are not defined for a language, it uses the plural rules of English: 1 book, 2 books. In CLDR, however, some languages are defined as not using any plural rules at all.

We could blame the translators for using plural syntax where it is not supported, or we could blame CLDR for having no plural rules for languages which do use plurals in some cases. It is not that simple, however. The typical example is a language which doesn't have distinct plural forms (like some words in English: 1 fish, 2 fish; but for all nouns), yet does use plural words when the number is not present: one fish, many fish.

As a compromise I have proposed an extension to the plural syntax to allow specifying the output when the number is 0 or 1 regardless of the usual plural rules for that language. Let’s take a real example:

Accepted by {{PLURAL:$1|you|$1 users including you}}.

This works fine in English, because the first form is always for number 1. In Belarusian it doesn’t work, because the first form is used for number 1, but also for numbers 21, 31, 41 etc. It could be solved by the following syntax:

{{PLURAL:$1|1=you|$1 users including you}}.

The slightly confusing part here is that now the second form is actually the singular form. This is more evident in the imaginary Belarusian translation:

{{PLURAL:$1|1=you|one|few|many|other}}

"you" is used for number 1, “one" for 21, 31, 41 but not 1, and the remaining forms as they usually are.

The explicit zero form (0=something) can also be useful for English and many other languages to have a different wording – something which is now usually done with separate messages.
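
To make the proposal concrete, here is a tiny sketch (mine, not MediaWiki's actual plural code) of how explicit number forms could be resolved before the language's normal plural rules kick in:

<?php
// Hypothetical helper: explicit "0=..."/"1=..." forms win for exact
// matches; everything else falls back to the usual plural rule, which
// is passed in as a callback returning a form index for a number.
function pickPluralForm( $number, array $forms, $pluralRuleIndex ) {
    $implicit = array();
    foreach ( $forms as $form ) {
        if ( preg_match( '/^(\d+)=(.*)$/s', $form, $m ) ) {
            if ( (int)$m[1] === (int)$number ) {
                return $m[2]; // e.g. "1=you" matched the number 1
            }
        } else {
            $implicit[] = $form;
        }
    }
    $index = call_user_func( $pluralRuleIndex, $number );
    return $implicit[ min( $index, count( $implicit ) - 1 ) ];
}
// pickPluralForm( 1, array( '1=you', 'one', 'few', 'many', 'other' ), $belarusianRule )
// returns 'you', while 21 returns 'one' (assuming the rule maps 21 to the first form).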

The message used above is from the Translate extension. Unfortunately we cannot start using this syntax until we have dropped backwards compatibility with the last MediaWiki version not supporting it, i.e. 1.20, which will be around when MediaWiki 1.22 is released. We are seriously considering backporting this functionality, but we also need to add support for the same syntax in JavaScript first.

During further testing we also found issues in the Hebrew plural rules. The position of the dual form had changed and we didn't notice it because the unit tests were wrong. This resulted in problems like the login page saying Remember my login for two days. It is a good reminder of how bugs in i18n can cause potentially severe issues.

Niklas in Muir Woods.

Niklas in Muir Woods. Testing new counting methods? (Photo by Pau Giner.)

Finnish translation sprint 2012-06 KDE – results

English summary: Using the MediaWiki Translate extension, the KDE SC 4.9 release was collaboratively translated into Finnish around midsummer. The goals of the translation sprint were to produce 10,000 new proofread translations and to unify the translations of many common terms. The goals were mostly met. Due to problems in counting the number of new translations, we had to change the measure to also include updated translations. We made 10,643 new or updated translations, and over 8,250 translations were proofread. I believe the combined effort makes a visible difference, though in absolute terms there is enough work left for tens of further sprints.


In June and July a KDE translation sprint was held, in which everything KDE was translated, but especially the KDE SC 4.9 release, of which we got all the most important parts done. Other KDE programs were also translated into Finnish and existing translations were improved. Particular progress was made on translating the KDE websites: previously only part of the common messages of the KDE sites had been translated, but now the whole Join the Game site is in Finnish, and translation of other sites has been started.

Another important achievement was improving the quality of the translations by proofreading the messages of the most important and most visible KDE programs, by unifying the terms in use and by going through common weaknesses. Some people were also inspired to fix things that had bothered them for a long time but had simply never been dealt with before. The IRC channel became more active too, and discussion about the translations emerged.

We did, however, fall short of our original goal of 10,000 new translations: there were 6,315 completely new ones. That figure is not entirely accurate, as it does not include fuzzy messages. Fuzzy messages contain a pre-filled translation that may either need a small touch-up or be completely wrong. A better figure is obtained by also counting improved translations, which gives 10,643 new or improved translations. Proofreading was not quite as popular, but still over 8,250 translations were proofread.

Many larger pieces were also left unfinished, sometimes only by a little. The unification of terms also remained unfinished, although it was only a secondary goal and admittedly too big a chunk. Nevertheless, several terms were unified; the workload is simply very large and too big to finish in one go.

Proofreading translations with Translate really worked, especially compared to the current practice in which translators email entire Gettext po files around and the mailing list meant for quality control is not even in use. The basic functionality of the tool was in good shape: hardly anything got in the way of translating. Some slowness was noticeable, though; the main cause of that is the translation memory feature.

We got 7–8 new people involved, of whom a few made dozens of translations and the rest clearly more. All the technical problems we ran into, from scaling problems on the statistics page to small fixes in exporting translations to po files, were resolved well.

It is still an open question how to get more people involved. We also need to think about how the tool's imports and exports relate to changes made directly in SVN before the tool could be adopted alongside the current practice of emailing po files. Exporting cannot be done automatically, however, because the usage rules of KDE SVN accounts prevent it.

In the future we will consider more permanent use of the tool. The biggest problem is reviving lokalisointi.org and connecting the terms there to the translation platform.

I want to thank all the participants for their translation work and for the feedback on how the platform worked. Special thanks to Lasse Liehu for compiling the draft of this summary.