bug-gettext

[bug #62677] Uniqueness of source identifier not clear


From: Gábor Hojtsy
Subject: [bug #62677] Uniqueness of source identifier not clear
Date: Mon, 27 Jun 2022 11:15:50 -0400 (EDT)

URL:
  <https://savannah.gnu.org/bugs/?62677>

                 Summary: Uniqueness of source identifier not clear
                 Project: GNU gettext
               Submitter: gaborhojtsy
               Submitted: Mon 27 Jun 2022 03:15:48 PM UTC
                Category: Doc
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Mon 27 Jun 2022 03:15:48 PM UTC By: Gábor Hojtsy <gaborhojtsy>
Drupal is a GPL-licensed content management system / web platform. I am one of
the core committers.

Drupal added import/export support for .po files in 2004 with Drupal 4.5.0.

Unfortunately, the assumption about source uniqueness that we have relied on
for the past 18 years appears to be coming back at us. The manual at
https://www.gnu.org/software/gettext/manual/gettext.html does not define what
exactly should be considered a source. There are two potential ways to define
a source; the following examples are copied directly from the docs:


#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"


and


#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] ""
msgstr[1] ""


Although this is not a recent change, it has only recently come to my attention
that, given the two examples above, the following two entries are invalid (not
unique):


# This will be invalid due to the standalone msgid from above.
msgid "Unknown system error"
msgid_plural "%d unknown system errors"
msgstr[0] ""
msgstr[1] ""


and


# This will be invalid due to the plural version from above.
msgid "One file removed"
msgstr ""


Both are invalid because they reuse a msgid that is already defined.
Unfortunately for Drupal, this does not match our implemented understanding of
the .po format: we understood that the msgid_plural part of a singular/plural
definition is not ignored for uniqueness but is instead part of the source
identifier (its name is msgid_... after all).
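
To make the two readings concrete, here is a minimal Python sketch that checks
the four entries above for duplicates under each rule. It is only an
illustration under stated assumptions, not gettext's or Drupal's actual code:
parse_entries and duplicates are hypothetical helpers, and the parser handles
only single-line, unescaped strings and ignores msgctxt.

```python
import re
from collections import Counter

# The two documented examples followed by the two entries reported as invalid.
PO_TEXT = '''
#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"

#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] ""
msgstr[1] ""

msgid "Unknown system error"
msgid_plural "%d unknown system errors"
msgstr[0] ""
msgstr[1] ""

msgid "One file removed"
msgstr ""
'''


def parse_entries(text):
    """Return (msgid, msgid_plural or None) for each blank-line-separated entry.

    Simplified on purpose: single-line strings only, no escapes, no msgctxt.
    """
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        msgid = plural = None
        for line in block.splitlines():
            m = re.match(r'(msgid|msgid_plural) "(.*)"$', line)
            if m and m.group(1) == "msgid":
                msgid = m.group(2)
            elif m:
                plural = m.group(2)
        if msgid is not None:
            entries.append((msgid, plural))
    return entries


def duplicates(keys):
    """Return the set of keys that occur more than once."""
    return {key for key, count in Counter(keys).items() if count > 1}


entries = parse_entries(PO_TEXT)

# Reading 1 (what the gettext tools appear to do): msgid alone is the key.
by_msgid = duplicates(msgid for msgid, _ in entries)

# Reading 2 (Drupal's understanding): the (msgid, msgid_plural) pair is the key.
by_pair = duplicates(entries)

print("duplicates keyed by msgid:", by_msgid)
print("duplicates keyed by (msgid, msgid_plural):", by_pair)
```

Under the first reading both msgids are reported as duplicates; under the
second reading all four entries are distinct.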

The docs on translating plural forms explicitly explain that, in the plural
case, the msgid cannot be interpreted without the msgid_plural and vice versa,
and that neither can be interpreted without a cardinal number:

> Such an entry denotes a message with plural forms, that is, a message where
the text *depends on a cardinal number*. The general form of the message, in
English, is the msgid_plural line. The msgid line is the English singular
form, that is, the form for when the number is equal to 1.

So, as documented, where a msgid and msgid_plural combination is defined, the
translation cannot be interpreted for a general case (independent of a
cardinal number), nor independently of the msgid_plural.

So why do gettext tools consider the msgid a standalone identifier, regardless
of the msgid_plural alongside it?

Regardless of which interpretation is correct, I think it is important to
update the documentation to make this clear, and I would be happy to suggest
the changes. (After that, Drupal may need to go through a considerable
transition, across multiple major versions, to update our understanding.)







    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?62677>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



