Re: Grammatical forms in translatable texts


From: Bruno Haible
Subject: Re: Grammatical forms in translatable texts
Date: Sun, 19 Apr 2020 14:04:27 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-177-generic; KDE/5.18.0; x86_64; ; )

[CCing bug-gettext.]
The original thread is about this NEWS entry in bison 3.5.90:

  **** Token aliases internationalization

    When the %define variable parse.error is set to `custom` or `detailed`,
    one may specify which token aliases are to be translated using _().  For
    instance

      %token
          PLUS   "+"
          MINUS  "-"
        <double>
          NUM _("double precision number")
        <symrec*>
          FUN _("function")
          VAR _("variable")

and started at
<https://lists.gnu.org/archive/html/bug-bison/2020-04/msg00011.html>.

> >  [en] msgid "syntax error, unexpected %s"
> >  [de] msgstr "Syntaxfehler, unerwartetes %s"
> >  [fr] msgstr "erreur de syntaxe, %s inattendu"
> > 
> > I know (for de) and think (for fr) that "unerwartetes"/"inattendu"
> > needs to take different forms depending on the gender of %s.
> ...
> > As a complex example, using these token names:
> > 
> >  "Cyrillic letter" -> "kyrillischer Buchstabe"
> >  "Latin letter" -> "lateinischer Buchstabe"
> >  "Greek letter" -> "griechischer Buchstabe"
> > 
> > a correctly translated message in de would look like this:
> > 
> >  "Syntaxfehler, unerwarteter(nom.masc.) kyrillischer(nom.masc.)
> >  Buchstabe(nom.), hatte einen(article/acc.masc.)
> >  lateinischen(acc.masc.) Buchstaben(acc.) oder [einen](same
> >  article, optional) griechischen(acc.masc.) Buchstaben(acc.)
> >  erwartet"
> > 
> > Of course, you might consider this nitpicking.

It is not nitpicking. A msgid "syntax error, unexpected %s", where
a translatable string is plugged in for %s, violates the i18n principle
"Entire sentences", documented at
https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html
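
To make the problem concrete, here is a minimal sketch in Python. The token
table and the helper names (`naive`, `agreeing`) are invented for
illustration; only the German adjective endings are real grammar. It shows
that a single msgstr with a fixed ending is right for one gender and wrong
for the others.

```python
# Hypothetical mini-catalog; not taken from bison or from any real PO file.
TEMPLATE_DE = "Syntaxfehler, unerwartetes %s"  # ending "-es" fits neuter only

# Translated token names with their grammatical gender (m/f/n).
TOKENS_DE = {
    "letter": ("Buchstabe", "m"),
    "number": ("Zahl", "f"),
    "character": ("Zeichen", "n"),
}

# Nominative ending of "unerwartet" when no article precedes it.
ENDING = {"m": "er", "f": "e", "n": "es"}


def naive(token):
    """Plug the translated token into the one translated template."""
    noun, _gender = TOKENS_DE[token]
    return TEMPLATE_DE % noun


def agreeing(token):
    """What a German reader expects: the adjective agrees with the noun."""
    noun, gender = TOKENS_DE[token]
    return "Syntaxfehler, unerwartet%s %s" % (ENDING[gender], noun)


for token in TOKENS_DE:
    print(naive(token), "|", agreeing(token))
```

The `agreeing` helper is there only to show the mismatch. Assembling
inflected words in code is not a workable approach either, since every
language would need its own rules.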

> but I believe contexts would help
> (https://www.gnu.org/software/gettext/manual/html_node/Contexts.html).

I don't think contexts can help here. Contexts are a kind of namespacing
system within a translation domain.
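
A rough sketch of what "namespacing" means here, in Python. The catalog is
modelled as a plain dict and `pgettext_sketch` is a hypothetical stand-in for
a pgettext-style lookup, not the real gettext API. The lookup key is the pair
(context, msgid), so a context can separate two identical msgids, but each
key still maps to exactly one msgstr.

```python
# Hypothetical in-memory catalog, keyed by (msgctxt, msgid).
CATALOG_DE = {
    ("menu", "Open"): "Öffnen",    # the verb, as in a menu entry
    ("status", "Open"): "Offen",   # the adjective, as in a status field
    (None, "syntax error, unexpected %s"): "Syntaxfehler, unerwartetes %s",
}


def pgettext_sketch(context, msgid):
    """Look up (context, msgid); fall back to the msgid if untranslated."""
    return CATALOG_DE.get((context, msgid), msgid)


# The context disambiguates two identical msgids ...
print(pgettext_sketch("menu", "Open"))
print(pgettext_sketch("status", "Open"))

# ... but the error template still has a single msgstr, so its adjective
# cannot follow the gender of whatever is substituted for %s.
print(pgettext_sketch(None, "syntax error, unexpected %s") % "Buchstabe")
```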

> > I've done something similar in
> > another program of mine where I needed two forms (only two,
> > luckily), and defined a "|" in the translations to separate them
> > (with no "|" meaning the same form for both), e.g.:
> > 
> >  msgid "Cyrillic letter"
> >  msgstr "kyrillischer Buchstabe|kyrillischen Buchstaben"

You can do such things only as long as
  1. the translators are aware of the semantics and syntax of such
     an alternation form,
  2. you have tooling support for it.

In general, I advise against inventing new things that the translators
would have to learn.
<https://lists.gnu.org/archive/html/bug-gettext/2019-03/msg00019.html>
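
For illustration, the program-side half of the quoted "|" convention might
look like the following Python sketch. The function name `split_forms` is an
assumption, and the convention itself belongs to the quoted program, not to
gettext. Nothing in the PO tooling knows about it, which is exactly why both
conditions above have to hold.

```python
def split_forms(msgstr):
    """Split 'nominative|accusative'; no '|' means one form serves both.

    Hypothetical helper for the ad-hoc convention quoted above.
    """
    first, sep, second = msgstr.partition("|")
    if not sep:
        return first, first
    return first, second


nominative, accusative = split_forms(
    "kyrillischer Buchstabe|kyrillischen Buchstaben"
)
print(nominative)
print(accusative)
```

A translator who has not been told about the separator would write a
single form, or use "|" literally, and no standard checker would flag either
case.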

> > Another option would be rather roundabout wordings to make sure the
> > token names always occur in the same case and without article, but
> > these would generally be less readable (and I'm not sure if even
> > possible in every language), something like:
> > 
> >  "syntax error, the token \"%s\" was unexpected, expected one of
> >  the following tokens: %s, ..."
> 
> Well, I have grown up in a world of rather terse err msgs, so I am
> probably biased here.  Again, if there is consensus for something
> different, I'll subscribe to it.

The general solution, which works for any language, is to relax the
requirement that the error message be a sentence. It can look
like a form instead. For example:

   Syntax error.
   Unexpected token: %s
   Expected one of the following tokens: %s, ...

This way it doesn't matter whether the string substituted for %s,
"kyrillischer Buchstabe", is masculine or neuter, or how it would
be declined in a sentence.
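
A minimal sketch of how a program could assemble such a form-like message,
in Python. The function name and the identity `_` are placeholders for
illustration and are not bison's implementation. Each label line is a
complete msgid of its own, and the token names appear only after a colon,
in their citation form.

```python
def _(msgid):
    """Placeholder for a real gettext lookup; returns the msgid unchanged."""
    return msgid


def format_syntax_error(unexpected, expected):
    """Build a form-like error report from already translated token names."""
    lines = [
        _("Syntax error."),
        _("Unexpected token: %s") % unexpected,
    ]
    if expected:
        # Joining with ", " is a simplification; list punctuation can also
        # be locale-dependent.
        lines.append(
            _("Expected one of the following tokens: %s") % ", ".join(expected)
        )
    return "\n".join(lines)


print(format_syntax_error(
    "kyrillischer Buchstabe",
    ["lateinischer Buchstabe", "griechischer Buchstabe"],
))
```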

Bruno



