bug-hurd

Re: console update


From: Marcus Brinkmann
Subject: Re: console update
Date: Mon, 17 Jun 2002 17:16:54 -0400
User-agent: Mutt/1.3.25i

On Mon, Jun 17, 2002 at 10:02:06AM -0700, Thomas Bushnell, BSG wrote:
> But I think we should hope for actual Unicode compliance.  Obviously I
> don't insist we have it now.  But I object to an architectural
> decision that will make it harder...

We will have Unicode Normalization Form C (NFC) and Unicode support
level 1.  This basically means: no combining characters and no
double-width characters.  No bidirectional writing either.

The architectural decision we are making right now is what the
communication protocol between the console server and the client looks
like in this first version.  And because we all agree that a screen
matrix of character cells is the right thing, this is what we use right
now.

The question is, what information should these matrix cells hold?
This is what we have now:

Bytes relative to &_matrix[0], type, meaning:
0-3     wchar_t         Unicode character in row 0, col 0.
4-7     conchar_attr_t  Video attributes for character in row 0, col 0.
8-11    wchar_t         Unicode character in row 0, col 1.
12-15   conchar_attr_t  Video attributes for character in row 0, col 1.
...
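
To make the layout above concrete, here is a minimal sketch of how a
client could compute the byte offset of a cell.  The names struct cell,
cell_char_t, cell_attr_t and cell_offset are made up for illustration
and are not the actual console interface; fixed 32-bit fields stand in
for wchar_t and conchar_attr_t so that the 8-byte cell size from the
table holds on any platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: fixed 32-bit fields so that the 8-byte cell
   layout holds regardless of the platform's wchar_t width.  */
typedef uint32_t cell_char_t;   /* stands in for wchar_t */
typedef uint32_t cell_attr_t;   /* stands in for conchar_attr_t */

struct cell
{
  cell_char_t chr;    /* bytes 0-3: Unicode character */
  cell_attr_t attr;   /* bytes 4-7: video attributes */
};

_Static_assert (sizeof (struct cell) == 8, "a cell must be 8 bytes");

/* Byte offset of the cell at (ROW, COL) relative to the start of the
   matrix, for a matrix that is WIDTH cells wide.  */
size_t
cell_offset (size_t row, size_t col, size_t width)
{
  return (row * width + col) * sizeof (struct cell);
}
```

With an 80-column matrix, cell_offset (0, 1, 80) gives 8, which matches
the third line of the table.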

Note that the cells include the scrollback buffer, and that the actual
visible area can start in any row in the matrix.
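
As a rough sketch of what this means for a client: a row of the visible
area has to be translated into a row of the matrix before it can be
read.  The function below assumes that the visible area wraps around at
the end of the matrix like a ring buffer, which is an assumption made
for this example, and visible_to_matrix_row is a hypothetical name.

```c
#include <stddef.h>

/* Map row VISIBLE_ROW of the visible area to a row of the matrix.
   START_ROW is the matrix row where the visible area begins and
   TOTAL_ROWS is the number of rows in the matrix, scrollback included.
   Assumes (for illustration only) that rows wrap around at the end
   of the matrix.  */
size_t
visible_to_matrix_row (size_t start_row, size_t visible_row,
                       size_t total_rows)
{
  return (start_row + visible_row) % total_rows;
}
```

For a 500-row matrix with the visible area starting in row 490, the
25th visible line (index 24) would then live in matrix row 14.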

This obviously limits us to Unicode support level 1.  Increasing the
number of wchar_t's for each cell would allow us to support combining
characters at least in the console server (passing them through to the
client).  However, this increases the memory footprint enormously.
For example, if you have a scrollback buffer of 500 lines, you have
500*80 = 40000 character cells.  If you support what ncurses supports,
that is up to five characters per cell, you have 24 bytes per cell
(5*sizeof(wchar_t) + sizeof(conchar_attr_t)).  This is 937 KB of data!
With 10 virtual consoles, this adds up to 9 MB of data, most of which is
unused.  As the common case is that only three bytes of the 24 are
actually relevant (two attribute bytes and one character byte) for
ISO Latin-1 characters, you have 21 wasted bytes per cell (an efficiency
of 12.5 percent), and what is worse, they are all interleaved with the
real data, so you touch all pages.
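
The arithmetic can be checked with a few lines of C.  This is only a
sketch: the 4-byte sizes for a character and for the attributes are
taken from the table above, and cell_bytes and matrix_bytes are
hypothetical helper names.

```c
#include <stddef.h>

/* Sizes taken from the matrix layout: 4 bytes per character and
   4 bytes of video attributes per cell.  */
#define MATRIX_CHAR_BYTES 4
#define MATRIX_ATTR_BYTES 4

/* Size of one cell that holds up to CHARS_PER_CELL characters.  */
size_t
cell_bytes (size_t chars_per_cell)
{
  return chars_per_cell * MATRIX_CHAR_BYTES + MATRIX_ATTR_BYTES;
}

/* Size of a whole matrix of ROWS x COLS such cells.  */
size_t
matrix_bytes (size_t rows, size_t cols, size_t chars_per_cell)
{
  return rows * cols * cell_bytes (chars_per_cell);
}
```

matrix_bytes (500, 80, 5) is 960000 bytes, or about 937 KB, while the
current one-character layout, matrix_bytes (500, 80, 1), needs 320000
bytes for the same scrollback.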

> I think it should send actual Unicode; and the lower-level is
> responsible for composing the character.

We do send Unicode, but we don't send anything that can be composed, as
composing is not part of Unicode support level 1 with NFC.  I think it
is a good idea to see what other folks who seriously work in this area
(usually for the Linux kernel) cook up before putting work into more
Unicode support.  When we get Unicode support level 1 and NFC, we will
have made a jump ahead that applications first need to catch up with
anyway (not all applications are ready for UTF-8 yet).

Thanks,
Marcus



