
## Re: [bug-gettext] Plural rule definitions

From: Michele Locati
Subject: Re: [bug-gettext] Plural rule definitions
Date: Thu, 21 May 2015 12:48:42 +0200

2015-05-21 10:09 GMT+02:00 Daiki Ueno:
ja      nplurals=1; plural=0;
vi      nplurals=1; plural=0;
ko      nplurals=1; plural=0;
en      nplurals=2; plural=(n != 1);
de      nplurals=2; plural=(n != 1);
nl      nplurals=2; plural=(n != 1);
sv      nplurals=2; plural=(n != 1);
da      nplurals=2; plural=(n != 1);
no      nplurals=2; plural=(n != 1);
nb      nplurals=2; plural=(n != 1);
nn      nplurals=2; plural=(n != 1);
fo      nplurals=2; plural=(n != 1);
es      nplurals=2; plural=(n != 1);
pt      nplurals=2; plural=(n != 1);
it      nplurals=2; plural=(n != 1);
bg      nplurals=2; plural=(n != 1);
el      nplurals=2; plural=(n != 1);
fi      nplurals=2; plural=(n != 1);
et      nplurals=2; plural=(n != 1);

For the above locales, my tool strips out the extra parentheses.

he      nplurals=4; plural=(n==1 ? 0 : n==2 ? 1 : n>10 && n%10==0 ? 2 : 3);

Here I had an extra check (n < 0) for case 2, that is ((n<0 || n>10) && n % 10 == 0). I have just removed it from my code.

Another difference here is the use of parentheses. As I described at https://github.com/mlocati/cldr-to-gettext-plural-rules#parenthesis-in-ternary-operators
I preferred to add some extra parentheses to avoid problems in PHP, where the associativity of the ternary operator is quite strange: "A ? 0 : B ? 1 : 2" is interpreted as "(A ? 0 : B) ? 1 : 2", whereas in other languages it's interpreted as "A ? 0 : (B ? 1 : 2)".
So, in my code I have:
he      nplurals=4; plural=(n == 1) ? 0 : ((n == 2) ? 1 : ((n > 10 && n % 10 == 0) ? 2 : 3))
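
To make the effect of the grouping concrete, here is a small Python sketch that emulates the two parse orders described above. Python has no C-style `?:` operator, so both groupings are spelled out with explicit parentheses; the function names are invented for this illustration, and the left grouping simply mirrors the PHP behaviour as described in this message.

```python
def right_grouped(a: bool, b: bool) -> int:
    """A ? 0 : (B ? 1 : 2), the grouping used by C and most languages."""
    return 0 if a else (1 if b else 2)


def left_grouped(a: bool, b: bool) -> int:
    """(A ? 0 : B) ? 1 : 2, the grouping described above for PHP."""
    inner = 0 if a else b          # result of the first ternary
    return 1 if inner else 2       # that result is then used as a condition


# Compare both groupings for every combination of A and B.
for a in (True, False):
    for b in (True, False):
        print(f"A={a!s:5} B={b!s:5} right={right_grouped(a, b)} left={left_grouped(a, b)}")
```

Whenever A is true, the left grouping yields 2 instead of 0, because the intermediate result 0 is treated as a false condition. Fully parenthesized expressions give the same answer under either grouping.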

eo      nplurals=2; plural=(n != 1);
hu      nplurals=2; plural=(n != 1);
tr      nplurals=2; plural=(n != 1);
pt_BR

Empty plural rules for Brazilian Portuguese? In CLDR, pt_BR is the same as pt (nplurals=2; plural=n > 1), whereas pt_PT should have a different rule (nplurals=2; plural=n != 1).
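
As a quick illustration of how little the two Portuguese rules differ, the following Python sketch evaluates both expressions over a range of integers and lists the values of n where they disagree. The function names are made up for this example; the expressions are the two quoted above.

```python
def plural_pt(n: int) -> int:
    """pt / pt_BR rule quoted above: plural = n > 1."""
    return int(n > 1)


def plural_pt_pt(n: int) -> int:
    """pt_PT rule quoted above: plural = n != 1."""
    return int(n != 1)


# Collect every n in 0..999 for which the two rules pick a different form.
differences = [n for n in range(1000) if plural_pt(n) != plural_pt_pt(n)]
print(differences)  # → [0]
```

The only disagreement is at n = 0, which takes the first form under "n > 1" and the second form under "n != 1".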

fr      nplurals=2; plural=(n > 1);
lv      nplurals=3; plural=(n%10==0 || (n%100>=11 && n%100<=19) ? 0 : n%10==1 && n%100!=11 ? 1 : 2);
ga      nplurals=5; plural=(n==1 ? 0 : n==2 ? 1 : n>=3 && n<=6 ? 2 : n>=7 && n<=10 ? 3 : 4);
ro      nplurals=3; plural=(n==1 ? 0 : n==0 || (n!=1 && n%100>=1 && n%100<=19) ? 1 : 2);
lt      nplurals=3; plural=(n%10==1 && (n%100<11 || n%100>19) ? 0 : n%10>=2 && n%10<=9 && (n%100<11 || n%100>19) ? 1 : 2);

Except for the extra parentheses, these match the output of my tool.

ru      nplurals=4; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : n%10==0 || (n%10>=5 && n%10<=9) || (n%100>=11 && n%100<=14) ? 2 : 3);

Russian has 3 plural forms, not 4. An old version of my tool produced the same strange result as yours. It's quite a complicated case, but here's what the above expression means:
- if n ends with 1 but not with 11: case 0 (named "one" in CLDR for ru)
- if n ends with 2, 3 or 4 (but not with 12, 13 or 14): case 1 (named "few" in CLDR for ru)
- if n ends with 0, 5, 6, 7, 8, 9, 11, 12, 13 or 14: case 2 (named "many" in CLDR for ru)
As you can see, case 3 (named "other" in CLDR for ru) never occurs for integers, so we should strip it.
I discovered this because the CLDR data has no integer example for the "other" case (i.e. no "@integer" in the pluralRule node of plurals.xml).

Here's the approach that I used (quite pragmatic, I know):
So, for me the Russian rule should be:
nplurals=3; plural=(n % 10 == 1 && n % 100 != 11) ? 0 : ((n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 12 || n % 100 > 14)) ? 1 : 2);
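
The claim that case 3 is unreachable can be checked by brute force. The sketch below is only an illustration (the function name is hypothetical): it transcribes the quoted 4-form expression into Python and collects every index it returns. Since the expression depends only on n % 10 and n % 100, scanning 0..999 covers every possible outcome.

```python
def ru_four_forms(n: int) -> int:
    """Transcription of the quoted nplurals=4 expression for ru."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 12 or n % 100 > 14):
        return 1
    if n % 10 == 0 or 5 <= n % 10 <= 9 or 11 <= n % 100 <= 14:
        return 2
    return 3  # the "other" branch under discussion


# Every plural index the 4-form expression can actually produce.
reached = {ru_four_forms(n) for n in range(1000)}
print(sorted(reached))  # → [0, 1, 2]
```

Index 3 never shows up, which is consistent with dropping the last branch and declaring nplurals=3.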

uk      nplurals=4; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : n%10==0 || (n%10>=5 && n%10<=9) || (n%100>=11 && n%100<=14) ? 2 : 3);

Exactly the same as Russian.

be      nplurals=4; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : n%10==0 || (n%10>=5 && n%10<=9) || (n%100>=11 && n%100<=14) ? 2 : 3);

Exactly the same as Russian.

sr      nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : 2);
hr      nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : 2);
cs      nplurals=3; plural=(n==1 ? 0 : n>=2 && n<=4 ? 1 : 2);
sk      nplurals=3; plural=(n==1 ? 0 : n>=2 && n<=4 ? 1 : 2);

Except for the extra parentheses, these match the output of my tool.

pl      nplurals=4; plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<12 || n%100>14) ? 1 : (n!=1 && n%10<=1) || (n%10>=5 && n%10<=9) || (n%100>=12 && n%100<=14) ? 2 : 3);

Similar to the Russian problem:
- n is 1 => case 0
- n ends with 2, 3 or 4 (but not 12, 13 or 14) => case 1
- n ends with 0 or 1 (but n is not 1 itself) => case 2
- n ends with 5, 6, 7, 8 or 9 => case 2
- n ends with 12, 13 or 14 => case 2
As you can see, case 3 never occurs.
For me, the Polish rule is:
pl nplurals=3; plural=(n == 1) ? 0 : ((n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 12 || n % 100 > 14)) ? 1 : 2);
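
One way to gain confidence in the shortened Polish rule is to compare it against the quoted 4-form expression. The Python sketch below transcribes both (function names are invented for this example) and looks for any n in 0..9999 where they disagree. This is a spot check over a finite range, not a proof, although apart from the special case n == 1 both expressions depend only on n % 10 and n % 100.

```python
def pl_four_forms(n: int) -> int:
    """Transcription of the quoted nplurals=4 expression for pl."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 12 or n % 100 > 14):
        return 1
    if (n != 1 and n % 10 <= 1) or 5 <= n % 10 <= 9 or 12 <= n % 100 <= 14:
        return 2
    return 3


def pl_three_forms(n: int) -> int:
    """Transcription of the proposed nplurals=3 expression for pl."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 12 or n % 100 > 14):
        return 1
    return 2


# Values of n where the two expressions select a different plural form.
mismatches = [n for n in range(10000) if pl_four_forms(n) != pl_three_forms(n)]
print(mismatches)  # → []
```

An empty list means the two expressions select the same form for every tested n, so the fourth form is never needed.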

sl      nplurals=4; plural=(n%100==1 ? 0 : n%100==2 ? 1 : n%100>=3 && n%100<=4 ? 2 : 3);

Except for the extra parentheses, this matches the output of my tool.

Ciao!
--
Michele