Re: Wide and UTF-8 international characters

bug-ncurses



From:	D. Stimits
Subject:	Re: Wide and UTF-8 international characters
Date:	Fri, 16 May 2003 17:06:51 -0600
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2b) Gecko/20021018

Thomas Dickey wrote:

On Fri, May 09, 2003 at 09:23:43AM +0000, John Smith wrote:

>Is there a summary of how to use ncurses with international (wide and utf-8)



there are no tutorials that I'm aware of
(a few manpage references don't count).

I've been adding to the test programs in ncurses to demonstrate these
and other functions.  test/view.c and test/ncurses.c for example.

>character sets? I can't figure out the right way to do it. Apparently vim

>does it, so it should be possible.


actually vim doesn't (it uses termcap-level functions to draw text, and
uses the same wide-character/multibyte string functions that ncurses uses
to manipulate the data).

This is something I'm becoming curious about (I have yet to experiment with it in ncurses; this is all theory for me so far). I ran ldd (Linux) on vim, and it shows that it links with libncurses, and not libtermcap or other term libs. It must be doing any non-7-bit-ASCII character via ncurses (though I haven't looked to see what the non-7-bit-ASCII output looks like in vim).

In reading a small book (booklet?) on the original curses (not ncurses), I found that it says the upper bit of 8-bit characters is used to mark standout mode. If I am using just a console or xterm, without ncurses, I can output the full 8-bit characters described as HTML 8-bit entities, echoed directly to a console (not with ncurses or any lib), such as "©", and get the copyright symbol that is like a 'c' inside of a circle (it happens that to echo this I echo an uninterpreted 169 decimal, typecast to char). So current terminals, whether console or X11, use the full 8 bits to create their display.

If the eighth bit is being used by curses, then the top 128 characters are lost to standout mode. On the other hand, if ncurses uses a separate byte (16 bits in all) to store attributes, while leaving the full 8 bits for display output, then ncurses can display the full 255 character entity set (HTML entity set) simply by sending the character straight to the terminal. I'm not positive, but this should include the full UTF-8 set, which is only single-byte.

Is ncurses storing attributes in a separate byte already? Or is it the way of the old book's description, with 7 bits for the character and the last bit for standout mode flagging? If a separate byte is used already, then it would seem that multibyte characters already have the "infrastructure" to be plugged into ncurses. [FYI, it would be rather useful to see an entity substitution ability, like "©" in HTML]
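
The two storage schemes contrasted above can be sketched roughly as follows. This is only an illustration of the bit arithmetic: the names and the 16-bit layout are assumptions made up for the example, not the actual cell format of curses or ncurses.

```python
# Illustrative only: these layouts are assumptions for the sake of the
# example, not the real cell format of any curses implementation.

STANDOUT_BIT = 0x80   # old-style scheme: the eighth bit flags standout
CHAR_MASK_7 = 0x7F    # which leaves seven bits for the character itself


def unpack_7bit(cell):
    """Split an 8-bit cell into (character code, standout flag)."""
    return cell & CHAR_MASK_7, bool(cell & STANDOUT_BIT)


ATTR_SHIFT = 8        # hypothetical 16-bit cell: attributes in the high byte
CHAR_MASK_8 = 0xFF    # and a full eight bits for the character


def pack_16bit(ch, attrs):
    """Combine an 8-bit character code and an 8-bit attribute byte."""
    return ((attrs & 0xFF) << ATTR_SHIFT) | (ch & CHAR_MASK_8)


def unpack_16bit(cell):
    """Split a 16-bit cell into (character code, attribute byte)."""
    return cell & CHAR_MASK_8, cell >> ATTR_SHIFT


# Character 169 read under the 7-bit scheme is taken for character 41
# with standout switched on.
print(unpack_7bit(169))                      # (41, True)

# With a separate attribute byte the character survives intact.
print(unpack_16bit(pack_16bit(169, 0x01)))   # (169, 1)
```

Under the first scheme the character code and the standout flag compete for the same byte, so anything above 127 cannot be represented; under the second they do not interfere.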

Pardon my curiosity; lately I've been looking at some non-7-bit ASCII clients, but the clients support only 8-bit, not multibyte, characters. I created a lightweight XML-style data tree storage mechanism that uses XML/HTML entities to represent characters that cannot be easily entered via a keyboard, and it turned out to be far more flexible/useful than I thought at first. I remember seeing that some of the development ncurses branch has partial or initial support for wide characters, and I wonder if separation of attributes (like the 8th bit in traditional/old curses for standout) has been part of this preparation?
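
As a rough sketch of the kind of entity substitution mentioned above, the snippet below uses `html.unescape` from the Python standard library; the helper name `expand_entities` is made up for the example. It also prints the bytes the resulting character takes in two encodings, which matters for the single-byte question: code 169 is one byte in Latin-1 but two bytes in UTF-8, since UTF-8 uses multibyte sequences for every character above 127.

```python
import html


def expand_entities(text):
    """Replace HTML character references such as &copy; with the characters they name."""
    return html.unescape(text)


symbol = expand_entities("&copy;")

print(ord(symbol))                # 169
print(symbol.encode("latin-1"))   # b'\xa9'      (one byte)
print(symbol.encode("utf-8"))     # b'\xc2\xa9'  (two bytes)
```

So an 8-bit-clean cell would be enough for a Latin-1 terminal, while a UTF-8 terminal would need the character emitted as a byte sequence.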


D. Stimits, stimits AT attbi DOT com


