
Re: unicode strings


From: Andrew Gaylard
Subject: Re: unicode strings
Date: Tue, 31 May 2005 09:27:15 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114

Gadi Bergman wrote:
I was thinking about two different solutions:

1. Very trivial solution for taking care of Western European languages
only:
Arrays of 'unsigned long' and 'unsigned short' will be presentable to
the user as strings by writing each element that is lower than 0x100 as
a char with that value and other elements as '?'. For this solution, DDD
assumes Western European single-byte character encoding of the string.

This is quite easy to do, especially if one assumes ISO-8859-1.
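As a rough illustration of how little is involved, here is a minimal sketch in Python (not DDD code; the function name `render_latin1` is made up for this example). It relies on the fact that the first 256 Unicode code points coincide with ISO-8859-1:

```python
def render_latin1(elements):
    """Render an array of integer elements as a display string.

    Values below 0x100 become the character with that value (which is
    exactly the ISO-8859-1 mapping); every other value becomes '?'.
    """
    return "".join(chr(v) if 0 <= v < 0x100 else "?" for v in elements)


# 'H', then 0xE9 (e-acute in ISO-8859-1), then a value outside the range
print(render_latin1([0x48, 0xE9, 0x20AC]))
```

Whether control characters or a terminating zero should be treated specially is left open here; the sketch only covers the mapping described above.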

2. Very complex solution for taking care of all languages:
Arrays of 'unsigned long' and 'unsigned short' will be presentable to
the user as strings by drawing each character using a Unicode font on
the DDD display section. For this solution, DDD does not assume any
multi-byte or single-byte encoding and may display a string composed of
any number of languages.

This proposal doesn't make sense without an encoding.  Remember that
the entire Unicode set won't fit into 16 bits, hence the need for
variable-length characters, as seen in UTF-8.  So an encoding is
required to know *which* subset of Unicode a short (i.e. 16-bit)
array entry should index.
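To make the point concrete, here is a small Python sketch (the helper `decode_shorts` is hypothetical, and UTF-16 little-endian is just one assumed choice of encoding). It shows a character that needs two 16-bit entries, so the array cannot be read one entry per character without knowing the encoding:

```python
import struct


def decode_shorts(units, encoding="utf-16-le"):
    """Pack 16-bit array entries into bytes and decode them with `encoding`."""
    raw = struct.pack("<%dH" % len(units), *units)
    return raw.decode(encoding, errors="replace")


# U+1D11E (musical G clef) lies above 0xFFFF, so UTF-16 stores it as
# the surrogate pair 0xD834 0xDD1E: three array entries, two characters.
units = [0x0041, 0xD834, 0xDD1E]
text = decode_shorts(units)
print(len(units), len(text))
```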

I do not recommend adding a dialog box for asking the user to specify an
encoding, certainly not for every string the user wishes to display.

Yes, certainly one wouldn't like to have to choose an encoding every
time a string is viewed.  I foresee a scheme where DDD uses $LANG,
unless it is overridden by the user.  All subsequent string
viewing/displaying would then be subject to the specified encoding.
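A sketch of that scheme in Python, under a few assumptions: the function name `pick_encoding` is invented, $LANG is taken to follow the usual POSIX form language[_territory][.codeset][@modifier], and the ISO-8859-1 fallback is my own choice for the case where $LANG names no codeset:

```python
import os


def pick_encoding(override=None, environ=os.environ):
    """Return the user's override if set, else the codeset named in $LANG."""
    if override:
        return override
    lang = environ.get("LANG", "")
    if "." in lang:
        # Keep the part after '.', and drop any '@modifier' suffix
        codeset = lang.split(".", 1)[1].split("@", 1)[0]
        if codeset:
            return codeset
    return "ISO-8859-1"  # assumed fallback, not something DDD defines


print(pick_encoding(environ={"LANG": "en_US.UTF-8"}))
print(pick_encoding("KOI8-R", environ={"LANG": "en_US.UTF-8"}))
```

The override would be set once (for example from a preferences dialog) and then apply to every string displayed afterwards.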

These proposals both depend on the X-server having the appropriate
fonts available, either locally or via a fontserver (which isn't too
hard to do and often works "out-of-the-box" nowadays).  They also
depend on Motif *actually* *using* the correct language as specified by
$LANG, something which is probably very simple but which has eluded
me thus far.

Any pointers on this aspect would be *VERY* much appreciated!  I'm
tearing my hair out...

Andrew

-----Original Message-----
From: Andrew Gaylard [mailto:address@hidden]
Sent: Monday, May 30, 2005 3:09 PM
To: Gadi Bergman
Cc: address@hidden
Subject: Re: unicode strings

Gadi Bergman wrote:

Hello Andrew and thank you for your very quick response,

I do not expect the protocol for X to be Unicode-based. However,
programs written in Java do display their characters correctly for all
languages when running under X. I assume that they are drawing their own
characters with built-in fonts. That would be hard to expect from
text-based GDB, but since DDD is a graphic application then it is much
more natural.

Thanks again,
Gadi.


OK, then what's needed is a dialog box for DDD to specify the encoding,
and a button in "examine memory" which will unpack the bytes returned by
gdb in the encoding specified.  Yeah, that would do the trick.

Feel like contributing a patch?
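
For what it's worth, the unpacking step could look roughly like the Python sketch below. It assumes the bytes arrive as text in the style of gdb's `x/Nxb` output (an address, an optional `<symbol>`, a colon, then `0xNN` values); the function name `unpack_examine_output` and the sample address are made up, and this is not how DDD actually talks to gdb:

```python
import re


def unpack_examine_output(gdb_output, encoding):
    """Collect the 0xNN byte values from x/Nxb-style text and decode them."""
    data = bytearray()
    for line in gdb_output.splitlines():
        # Everything after the last ':' is the list of byte values
        values = line.rpartition(":")[2]
        for tok in re.findall(r"0x[0-9a-fA-F]{2}\b", values):
            data.append(int(tok, 16))
    # Undecodable bytes are replaced rather than raising an error
    return bytes(data).decode(encoding, errors="replace")


sample = "0x804a010 <buf>:\t0x48\t0xc3\t0xa9\t0x21\n"
print(unpack_examine_output(sample, "utf-8"))
print(unpack_examine_output(sample, "iso-8859-1"))
```

The same four bytes come out as three characters under UTF-8 and four under ISO-8859-1, which is the reason the encoding has to be specified somewhere.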







