Muir Woods has one tree – plural issues in MediaWiki

While I was having fun with the rest of the Wikimedia I18n team in San Francisco, a stream of plural related bug reports started coming in. The cause is that we have recently scrapped the custom plural rules in MediaWiki in favor of using plural rules from the CLDR database. A temporary fix has been applied to mitigate the reported issues.

The problem manifestation is pretty simple; in some languages in some contexts the message was always one something. For example the category page would say This category has one page regardless of how many pages there were in it. At first I was baffled. After all we had written unit tests for all languages in MediaWiki and they reported no regressions. Turns out we had ignored one particular set of languages: those which don’t always use plurals and had no plural rules defined in MediaWiki. The problems started when those language used plural even though they weren’t supposed to. When plural rules are not defined for a language, those languages use the plural rules as defined for the English language: 1 book, 2 books. In CLDR, however, some languages have been defined to not use any plural rules at all.

We could blame the translators for using plural syntax when they are not supported, or we could blame the CLDR for having no plurals rules for languages which do use plurals in some cases. It is not that simple, however. The typical example is a language which doesn’t have distinct plural forms (like some words in English: 1 fish, 2 fish; but for all nouns), but do use plural quantifiers if the number is not present: one fish, many fish.

As a compromise I have proposed an extension to the plural syntax to allow specifying the output when the number is 0 or 1 regardless of the usual plural rules for that language. Let’s take a real example:

Accepted by {{PLURAL:$1|you|$1 users including you}}.

This works fine in English, because the first form is always for number 1. In Belarusian it doesn’t work, because the first form is used for number 1, but also for numbers 21, 31, 41 etc. It could be solved by the following syntax:

{{PLURAL:$1|1=you|$1 users including you}}.

The slightly confusing part here is that now the second form is actually the singular form. This is more evident in the imaginary Belarus translation:

{{PLURAL:$1|1=you|one|few|many|other}}

"you" is used for number 1, “one" for 21, 31, 41 but not 1, and the remaining forms as they usually are.

The explicit zero form (0=something) can also be useful for English and many other languages to have a different wording – something which is now usually done with separate messages.

The message used above is from the Translate extension. Unfortunately we cannot start using this syntax until we have dropped backwards compatibility with the last MediaWiki version not supporting  this syntax i.e. 1.20, which would be around when MediaWiki 1.22 is released. We are seriously considering to backport this functionality, but we also need to add support for the same syntax in JavaScript first.

During further testing we also found issues in Hebrew plural rules. The position of dual was changed and we didn’t notice it because the unit tests were wrong. This resulted in problems like the login page saying Remember my login for two days. It just helps reminding how bugs in i18n can cause potentially severe issues.

Niklas in Muir Woods.

Niklas in Muir Woods. Testing new counting methods? (Photo by Pau Giner.)

-- .

One thought on “Muir Woods has one tree – plural issues in MediaWiki

  1. Nemo

    As for “The slightly confusing part here is that now the second form is actually the singular form”, silly question: I don’t understand, why can you just make the 1= parameter the last one?

Comments are closed.