bug-bison

Re: Run-time internationalized messages


From: Bruce Lilly
Subject: Re: Run-time internationalized messages
Date: Sat, 03 May 2003 12:07:56 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

Hans Aberg wrote:

If one uses multiple parser algorithms/computer output languages, it
becomes a nuisance to coordinate them all, unless one rationalizes
such things by putting the strings in one special file.

Good point. I've only used bison to produce C parsers, so I have no
idea whether the strings are similar for other output languages.

Now you evidently want a dynamic approach. One approach might be to put all
the default strings in character arrays, which easily can be changed at
runtime, if the names of the strings are known. If the strings are already
in M4 macros, the only thing that would be needed is a special M4 skeleton
file.

For C, one approach for the output file is to simply use one array of
strings, which can be accessed by an index computed from a basic
message index and a language index. Equivalently, it could be viewed
as a 2-D array[msg][lang].
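
A rough sketch of that single-array layout follows. All names here (yy_msg,
yy_lang, yy_messages, yy_message) are invented for illustration and are not
anything bison generates; the French strings are placeholder translations
with the accents left out to stay within us-ascii.

```c
#include <stddef.h>

/* Hypothetical message and language indices (not bison's own names). */
enum yy_msg  { YYMSG_SYNTAX_ERROR, YYMSG_MEMORY_EXHAUSTED, YYMSG_COUNT };
enum yy_lang { YYLANG_EN, YYLANG_FR, YYLANG_COUNT };

/* One flat array of strings, laid out message-major: all languages
   for message 0 come first, then all languages for message 1, etc. */
static const char *const yy_messages[YYMSG_COUNT * YYLANG_COUNT] = {
    /* YYMSG_SYNTAX_ERROR */
    "syntax error",     "erreur de syntaxe",
    /* YYMSG_MEMORY_EXHAUSTED */
    "memory exhausted", "memoire epuisee",
};

/* Compute the flat index from the message index and the language index. */
static const char *yy_message(enum yy_msg msg, enum yy_lang lang)
{
    return yy_messages[(size_t)msg * YYLANG_COUNT + (size_t)lang];
}
```

Declaring the table as yy_messages[YYMSG_COUNT][YYLANG_COUNT] instead would
give the equivalent 2-D view with the same memory layout. Either way the
strings are compiled into the parser, so nothing has to be read from a file
at run time.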

As for the question of making the thing platform independent, there is no
such thing with respect to output languages like C/C++, so there you are
left out in the cold. When I discussed it in a C++ newsgroup, the best
thing that people who really need this feature (such as those writing WWW
browsers/servers) could currently find was to give a name to each
character according to some encoding, and then use those names. For
example, using Unicode:
   unsigned LATIN_CAPITAL_LETTER_A = 0x0041;
   ...
or
   #define LATIN_CAPITAL_LETTER_A 0x0041
   ...
Then use LATIN_CAPITAL_LETTER_A instead of "A". One can probably easily
produce such a list of characters by taking the Unicode Namelist and
converting it to C format with a suitable small program.
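
The core of such a conversion program might look like the sketch below. It
assumes input lines of the form "0041;LATIN CAPITAL LETTER A;..." (code
point in hex, a semicolon, then the character name), which is how the
Unicode Character Database file UnicodeData.txt is laid out; a name list in
a different layout would need different parsing. The function name
unicode_line_to_define is made up for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert one "CODE;NAME;..." line into "#define NAME 0xCODE".
   Returns 0 on success, -1 if the line cannot be converted. */
static int unicode_line_to_define(const char *line, char *out, size_t outsz)
{
    char *end;
    unsigned long code = strtoul(line, &end, 16);
    if (end == line || *end != ';')
        return -1;                      /* no hex code point, or no ';' */

    const char *name = end + 1;
    size_t len = strcspn(name, ";\n");  /* name ends at next ';' or EOL */
    if (len == 0 || !isalpha((unsigned char)name[0]))
        return -1;                      /* skips entries like "<control>" */

    char ident[128];
    if (len >= sizeof ident)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        /* Spaces and hyphens are not valid in C identifiers. */
        ident[i] = isalnum(c) ? (char)c : '_';
    }
    ident[len] = '\0';

    int n = snprintf(out, outsz, "#define %s 0x%04lX", ident, code);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

A driver would read the file line by line with fgets(), call this function
on each line, and print the lines for which it returns 0.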

I didn't have anything quite so elaborate in mind.  I would imagine that
each language would have an associated charset (e.g. us-ascii, iso-8859-x,
utf-8).  What I did intend was that the implementation shouldn't depend
on pulling the strings out of an external file at run time, since some
target platforms running a parser might not have a file system as such
(think embedded systems, cell phones, etc.).

Aside from dealing with the output programming language issue (call that
#1), I can imagine a few others:

2. API for language switching

3. Where the language-switching code goes -- in each generated parser
   file, or in a library archive a la liby.a.

4. How the parser keeps track of the desired language, which will have
   to work for pure parsers as well as for non-reentrant ones.

5. Actually integrating it into the bison build process, automake, autoconf,
   etc.

#2, #3, and #4 are related. E.g. if the language-switching code goes
in each parser (presumably because the implementation can't be handled
in library code), then it should probably be affected by a prefix
change (yy -> some prefix).
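
To make #2 and #4 a bit more concrete, here is one possible shape for the
switching code, sketched under the assumption that it lives in each
generated parser. None of these names (yy_lang, yy_set_language,
yy_get_language, yy_context, yy_context_set_language) exist in bison; they
are hypothetical, and with a prefix change each yy would become the chosen
prefix.

```c
#include <stddef.h>

/* Hypothetical language identifiers. */
enum yy_lang { YYLANG_EN, YYLANG_FR, YYLANG_COUNT };

static int yy_lang_valid(enum yy_lang lang)
{
    return (unsigned)lang < (unsigned)YYLANG_COUNT;
}

/* Non-reentrant parser: one file-scope setting shared by every call. */
static enum yy_lang yy_current_lang = YYLANG_EN;

/* Returns 0 on success, -1 (leaving the setting unchanged) otherwise. */
static int yy_set_language(enum yy_lang lang)
{
    if (!yy_lang_valid(lang))
        return -1;
    yy_current_lang = lang;
    return 0;
}

static enum yy_lang yy_get_language(void)
{
    return yy_current_lang;
}

/* Pure parser: the language lives in a caller-owned context that would
   be passed to the parser, so no global state is involved. */
struct yy_context {
    enum yy_lang lang;
};

static int yy_context_set_language(struct yy_context *ctx, enum yy_lang lang)
{
    if (ctx == NULL || !yy_lang_valid(lang))
        return -1;
    ctx->lang = lang;
    return 0;
}
```

The error-reporting code would then consult yy_get_language() or ctx->lang
when picking a message string. If the same code could instead live in a
library, the prefix question would go away, but the library would need some
way to reach each parser's message table.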






