Re: [bug-gettext] msgmerge confuses unrelated entries

From:	Loic Dachary
Subject:	Re: [bug-gettext] msgmerge confuses unrelated entries
Date:	Thu, 14 Sep 2017 23:42:17 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

Hi Bruno!

It is a great feeling, when filing a bug, to get an immediate response that 
shows such great attention to detail :-)

On 09/14/2017 10:06 PM, Bruno Haible wrote:
> Hello dear Loïc,
> 
>> Given the attached messages.pot and messages.po, when calling:
>>
>>     msgmerge --previous --update messages.po messages.pot
>>
>> it produces the attached messages.po.updated file. A diff between 
>> messages.po and messages.po.update shows:
>>
>> -#: journalist.py:237 journalist.py:453
>> -msgid "Two-factor token failed to verify"
>> -msgstr "Échec de vérification du jeton de la validation en deux étapes"
>> +#: journalist.py:237 journalist.py:455
>> +#, fuzzy
>> +#| msgid "Reset Two-Factor Authentication"
>> +msgid "Could not verify token in two-factor authentication."
>> +msgstr "Réinitialiser la validation en deux étapes"
>>
>> which incorrectly associates "Reset Two-Factor Authentication" as the 
>> previous version of the "Could not verify token in two-factor 
>> authentication." string and therefore swaps the translations. The 
>> messages.po file has the following entry, unmodified in messages.po.updated
>>
>> #: journalist_templates/edit_account.html:53
>> msgid "Reset Two-Factor Authentication"
>> msgstr "Réinitialiser la validation en deux étapes"
>>
>> The correct behavior would be to keep the original translation. It is always 
>> incorrect to assume two different phrases (in this case "Could not verify 
>> token in two-factor authentication." and "Reset Two-Factor Authentication") 
>> have exactly the same translation.
> 
> There are apparently misunderstandings about what msgmerge does and what
> you can expect from it.
> 
> 1) The updated messages.po file contains messages marked with "#, fuzzy".
> These will be ignored by tools that make the translations available to
> programs (msgfmt and such).

I did not know that and it makes total sense.
>
> The "incorrect associations" between strings that you refer to are therefore
> void, unless the translator has modified/validated them.

Understood. 
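
To check my understanding of point 1, here is a minimal sketch of how a 
consumer of the .po file could skip fuzzy entries. It is not how msgfmt is 
implemented; the parser only handles single-line msgid/msgstr entries 
separated by blank lines, and the names parse_po and usable_catalog are 
hypothetical.

```python
SAMPLE_PO = '''\
#: journalist.py:237 journalist.py:455
#, fuzzy
#| msgid "Reset Two-Factor Authentication"
msgid "Could not verify token in two-factor authentication."
msgstr "Réinitialiser la validation en deux étapes"

#: journalist_templates/edit_account.html:53
msgid "Reset Two-Factor Authentication"
msgstr "Réinitialiser la validation en deux étapes"
'''


def parse_po(text):
    """Parse a simplified PO file (assumption: one-line strings only)."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {"fuzzy": False, "msgid": None, "msgstr": None}
        for line in block.splitlines():
            if line.startswith("#,"):
                # Flag line, e.g. "#, fuzzy" or "#, fuzzy, python-format".
                flags = [flag.strip() for flag in line[2:].split(",")]
                if "fuzzy" in flags:
                    entry["fuzzy"] = True
            elif line.startswith("msgid "):
                entry["msgid"] = line[len("msgid "):].strip().strip('"')
            elif line.startswith("msgstr "):
                entry["msgstr"] = line[len("msgstr "):].strip().strip('"')
            # "#:" references and "#|" previous-msgid lines are ignored here.
        if entry["msgid"] is not None:
            entries.append(entry)
    return entries


def usable_catalog(entries):
    """Keep only translated entries that are not marked fuzzy."""
    return {
        entry["msgid"]: entry["msgstr"]
        for entry in entries
        if not entry["fuzzy"] and entry["msgstr"]
    }


if __name__ == "__main__":
    for msgid, msgstr in usable_catalog(parse_po(SAMPLE_PO)).items():
        print(f"{msgid!r} -> {msgstr!r}")
```

With the two entries above, only "Reset Two-Factor Authentication" ends up in 
the catalog; the fuzzy entry is left out until a translator validates it.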

> 2) The updated messages.po file is meant for review / translation update by
> the translator. The "#, fuzzy" mark is a hint to the translator, meaning
> "look here, here is something to do for you". The translator is supposed
> to modify the translation and then only remove the "#, fuzzy" mark.

These files are fed to Weblate 
(https://weblate.securedrop.club/projects/securedrop/securedrop/fr/ to be 
precise), and the translator is indeed prompted to review all strings marked 
fuzzy.

> 3) The effect of the --previous option, namely "#| msgid ...", is to help
> the translator. In your example,
> 
> #: journalist.py:107
> #, fuzzy
> #| msgid "You must be an administrator to access that page"
> msgid "Only administrators can access this page."
> msgstr "Vous devez être administrateur pour accéder à cette page"
> 
> the translator may notice that the old and new msgid are semantically
> the same; this will help her decide what to do about the translation.

Weblate makes good use of this previous msgid by showing a precise diff that 
clearly highlights even the smallest changes, such as a change in the case of 
a single character.
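
As an illustration of that kind of diff, here is a rough sketch using 
Python's difflib. This is only an assumption about how such a display could 
be produced, not Weblate's actual code; inline_diff and the [-...-] / {+...+} 
markers are made up for the example.

```python
import difflib


def inline_diff(old, new):
    """Return a character-level diff of old against new.

    Removed text is wrapped in [-...-], added text in {+...+}.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(old[i1:i2])
            continue
        if i1 < i2:  # characters present only in the old msgid
            parts.append("[-" + old[i1:i2] + "-]")
        if j1 < j2:  # characters present only in the new msgid
            parts.append("{+" + new[j1:j2] + "+}")
    return "".join(parts)


if __name__ == "__main__":
    # A one-character case change stands out clearly.
    print(inline_diff("Page", "page"))  # prints "[-P-]{+p+}age"
    # The previous and current msgid from Bruno's example.
    print(inline_diff(
        "You must be an administrator to access that page",
        "Only administrators can access this page.",
    ))
```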

>> The messages.po file has the following entry, unmodified in 
>> messages.po.updated
>>
>> #: journalist_templates/edit_account.html:53
>> msgid "Reset Two-Factor Authentication"
>> msgstr "Réinitialiser la validation en deux étapes"
>>
>> The correct behavior would be to keep the original translation.
> 
> msgmerge has kept the original translation: you are saying yourself that
> this message is unmodified in messages.po.updated.
> 
>> It is always incorrect to assume two different phrases (in this case
>> "Could not verify token in two-factor authentication." and
>> "Reset Two-Factor Authentication") have exactly the same translation.
> 
> No one is making such an assumption. 

I think this is what happened in the example I sent. I apologize for not 
explaining it clearly and for presenting the information in a confusing 
manner. The original file has:

...
#: journalist.py:237 journalist.py:453
msgid "Two-factor token failed to verify"
msgstr "Échec de vérification du jeton de la validation en deux étapes"
...
#: journalist_templates/edit_account.html:53
msgid "Reset Two-Factor Authentication"
msgstr "Réinitialiser la validation en deux étapes"
...

and after running msgmerge it becomes:

...
#: journalist.py:237 journalist.py:455
#, fuzzy
#| msgid "Reset Two-Factor Authentication"
msgid "Could not verify token in two-factor authentication."
msgstr "Réinitialiser la validation en deux étapes"
...
#: journalist_templates/edit_account.html:53
msgid "Reset Two-Factor Authentication"
msgstr "Réinitialiser la validation en deux étapes"
...

And this line:

#| msgid "Reset Two-Factor Authentication"

is incorrect. The previous string was

msgid "Two-factor token failed to verify"

Probably because of that confusion, the suggested translation is the one 
associated with the incorrect previous string.
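
To picture what fuzzy matching does, here is a small sketch that looks for 
the old msgid closest to a new one. It uses Python's difflib as a stand-in; 
msgmerge has its own similarity measure and thresholds, so the scores and 
even the winner may differ from what msgmerge picks. closest_old_msgid is a 
hypothetical name.

```python
import difflib


def closest_old_msgid(new_msgid, old_msgids):
    """Return (score, msgid) for the old msgid most similar to new_msgid.

    The score is difflib's ratio, between 0.0 and 1.0.
    """
    scored = [
        (difflib.SequenceMatcher(a=old, b=new_msgid, autojunk=False).ratio(), old)
        for old in old_msgids
    ]
    return max(scored)


if __name__ == "__main__":
    old_msgids = [
        "Two-factor token failed to verify",
        "Reset Two-Factor Authentication",
    ]
    new_msgid = "Could not verify token in two-factor authentication."
    score, best = closest_old_msgid(new_msgid, old_msgids)
    print(f"closest old msgid: {best!r} (ratio {score:.2f})")
```

Whatever the measure, the match is made on the msgid text alone, which is why 
an entry that still exists unchanged can also be proposed as the previous 
version of a different, new string.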

> But it is frequent, when a program
> evolves, that a message gets duplicated and modified. For example:
> 
>   msgid "The server did not accept your credentials."
>   msgstr "Le serveur n'a pas accepté votre identité."
> 
> could become, in the next .pot file
> 
>   msgid "The server did not accept your user name."
>   msgstr ""
> 
>   msgid "The server did not accept your password."
>   msgstr ""
> 
> It *will* help the translator to have the same old translation appear
> twice among the proposed translations:
> 
>   #, fuzzy
>   msgid "The server did not accept your user name."
>   msgstr "Le serveur n'a pas accepté votre identité."
> 
>   #, fuzzy
>   msgid "The server did not accept your password."
>   msgstr "Le serveur n'a pas accepté votre identité."

This makes sense and I see how it can be confusing for the translator in some 
cases.
> 
> Finally, if the translator has a feeling that too many fuzzy
> translations have been produced and that it would be better to
> leave out these useless translation proposals, she can use
> option --no-fuzzy-matching.

After I manually fixed the incorrect association introduced by msgmerge, the 
rest of the fuzzy-matched strings became a huge help for the translator. I'm 
convinced there is value in it. And thanks to your detailed explanation, I 
also understand that it may sometimes be wise to deactivate fuzzy matching.
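
For my own scripts, switching between the two modes could look like the 
sketch below. The msgmerge options are the ones mentioned in this thread; the 
helper msgmerge_command and its fuzzy_matching parameter are hypothetical. 
The sketch only prints the command, since --update rewrites the .po file in 
place.

```python
def msgmerge_command(po_path, pot_path, fuzzy_matching=True):
    """Build the msgmerge argument list used to update po_path from pot_path."""
    cmd = ["msgmerge", "--previous", "--update"]
    if not fuzzy_matching:
        # Leave new strings untranslated instead of proposing fuzzy matches.
        cmd.append("--no-fuzzy-matching")
    cmd += [po_path, pot_path]
    return cmd


if __name__ == "__main__":
    cmd = msgmerge_command("messages.po", "messages.pot", fuzzy_matching=False)
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```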

Cheers

> Bruno
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
