Re: ACS_VLINE characters shifted in UTF-8 xterm
Mon, 6 Apr 2009 17:25:29 -0400 (EDT)
On Mon, 6 Apr 2009, Sebastian Kayser wrote:
Thomas Dickey wrote:
On Sun, Apr 05, 2009 at 01:35:49PM +0200, Sebastian Kayser wrote:
i have an ncurses-based application that draws a vertical line via
ACS_VLINEs to the screen. When i run this application in a UTF-8 xterm
(xterm-243) the ACS_VLINEs come out as a kind of a right-shifted
staircase. When i set NCURSES_NO_UTF8_ACS=1 or when i run the
application in dtterm the vertical line looks fine.
The application in question is mcabber, but i can recreate this
issue with a few lines of code. It doesn't matter whether i link
against a regular ncurses or a --enable-widec one.
I have tried to wrap my head around this and read the relevant ncurses
FAQ items. From what i understand xterm is capable of interpreting the
ACS form characters just fine even in UTF-8 mode (so no need to set
NCURSES_NO_UTF8_ACS). I have truss'ed the application to see how the
vertical line gets written to the terminal.
1B [ H [ s t a t u s ] 0E x0F1B [ m
1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [
m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B
[ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F
1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x
So it is basically a sequence of "smacs, x, rmacs, sgr0, cud, \b"
sequences to draw the line. When i manually echo ACS_VLINE characters to
the terminal i can see that they consume two columns instead of one and
i suppose that's why the \b isn't sufficient to move completely backwards.
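For anyone who wants to replay the per-row output by hand, here is a minimal sketch in C that rebuilds the byte sequence seen in the trace above. The escape strings are simply the bytes from that trace (what terminfo supplied for this xterm), not universal constants, and the names ROW_SEQUENCE and build_vline are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bytes as they appear in the truss output; other terminals may differ. */
#define SMACS "\x0e"    /* shift out: select the line-drawing set */
#define RMACS "\x0f"    /* shift in: back to the normal set */
#define SGR0  "\x1b[m"  /* reset attributes */
#define CUD1  "\x1b[1B" /* cursor down one row */

/* One row of the line: smacs, 'x' (vertical bar in the ACS set), rmacs,
 * sgr0, cud, then a single backspace. */
static const char ROW_SEQUENCE[] = SMACS "x" RMACS SGR0 CUD1 "\b";

/* Hypothetical helper: append the row sequence `rows` times to buf,
 * stopping early if it would not fit. Returns the number of bytes written
 * (excluding the terminating NUL). */
static size_t build_vline(char *buf, size_t cap, int rows)
{
    const size_t step = sizeof ROW_SEQUENCE - 1;
    size_t len = 0;

    for (int r = 0; r < rows; r++) {
        if (len + step >= cap)  /* keep room for the NUL */
            break;
        memcpy(buf + len, ROW_SEQUENCE, step);
        len += step;
    }
    if (cap > 0)
        buf[len] = '\0';
    return len;
}

#ifdef DEMO_MAIN
int main(void)
{
    char buf[256];
    size_t len = build_vline(buf, sizeof buf, 10);

    assert(len > 0);
    fwrite(buf, 1, len, stdout);  /* draws from the current cursor position */
    fflush(stdout);
    return 0;
}
#endif
```

Compiled with -DDEMO_MAIN and run in the affected terminal, this should show the same staircase without involving ncurses at all, which helps separate the terminal's behaviour from the library's.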
Here's a guess:
First of all, thanks for your response. Indeed, with the -mk_width
option to xterm the vertical line characters only consume one column and
the vertical line shows up properly aligned.
Now i am just trying to understand what is going on here. :) Please
correct me if anything of the following is wrong.
Both ncurses and xterm check the widths of characters using wcwidth.
However, ncurses doesn't check the ACS_xxx widths (lots of things
would break if they're allowed to be something other than 1).
xterm may internally translate the VT100-style line-drawing to Unicode
values (see charsets.c around line 310), and if wcwidth returns an
unexpected value for that, it could produce the sort of effect you describe.
So ncurses assumes ACS_VLINE to be 1 column wide and hence emits only a
single \b to move the cursor backwards.
On the other hand, xterm converts ACS_VLINE to 0x2502 ("BOX DRAWINGS
LIGHT VERTICAL") and calls wcwidth() to determine the width of this
character. According to Solaris locale handling it is 2 columns wide,
which is, i guess, one reason why you say Solaris locales have some quirks,
as this character would perfectly fit into one column?
right (it's always possible to double a line-drawing character,
but difficult to write just half ;-)
So now that xterm has put the vertical line character in two columns,
the single \b from ncurses only goes back one column (\b is always
column-oriented), which is not enough to line up with the vertical line
character in the line above.
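The drift described here can be checked with a toy model. This is only a sketch of the arithmetic, not ncurses or xterm code: glyph_column is a hypothetical function, and it ignores clamping at the right margin.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy model of "draw glyph, cursor down, backspace" repeated per row.
 *   app_width  = columns the application assumes the glyph occupies
 *                (ncurses assumes 1 for ACS characters)
 *   term_width = columns the terminal actually advances for the glyph
 * Returns the column at which the glyph on the given row starts.
 */
static int glyph_column(int start_col, int row, int app_width, int term_width)
{
    int col = start_col;

    assert(app_width >= 1 && term_width >= 1);
    for (int r = 0; r < row; r++) {
        col += term_width;  /* terminal draws the glyph and advances */
        col -= app_width;   /* one \b per column the application expects */
    }
    return col;
}

#ifdef DEMO_MAIN
int main(void)
{
    for (int row = 0; row < 5; row++)
        printf("row %d: agreed widths -> col %d, terminal uses 2 -> col %d\n",
               row, glyph_column(10, row, 1, 1), glyph_column(10, row, 1, 2));
    return 0;
}
#endif
```

When both sides agree on a width of 1 the column never changes; when the terminal advances two columns, each row starts one column further right, which is the staircase.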
I am using Solaris 10 in case that matters, LC_CTYPE is set to
en_US.UTF-8. It seems as if i am missing only a small part of the puzzle.
What's wrong with those vertical line characters and my UTF-8 xterm?
Solaris locale support seems to have some quirks (I could digress).
I'd try setting the mkWidth resource, which _should_ tell xterm to
use its built-in locale table.
There's a command-line option (-mk_width).
Finally, with the -mk_width option the xterm uses its internal
mk_wcwidth() to determine the column width and this returns 1 for
0x2502, so now xterm and ncurses go hand in hand WRT the width of the
vertical line and everything is fine (... dtterm had been fine from the
start because it doesn't translate the ACS_xxx chars into Unicode).
Would you say that -mk_width / mkWidth is a reasonable default for xterm
and UTF-8 locales in Solaris?
I can add a check for this to xterm's sanity check (at runtime,
to see if the system's wcwidth is broken or not).
There is a bug entry at Sun Solve (only accessible to Sun customers :/) that
explains the rationale behind the mkwidth() behaviour and it seems to be
because the _major_ target for localization is Asia and there some font
codepoints in the range of box characters do actually consume two
columns. Not quite sure, whether i got that right there. Please feel
free to digress about Solaris locale support ;)
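The runtime check mentioned above (testing whether the system's wcwidth is broken) might look roughly like the following. This is a sketch under assumptions, not xterm's actual code: the function name, the choice of the U+2500..U+257F box-drawing block as the probe range, and the stub width functions are all invented for this illustration. Only setlocale() and wcwidth() are real library calls.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

typedef int (*width_fn)(wchar_t);

/* Count code points in the box-drawing block (U+2500..U+257F) for which
 * the given width function reports anything other than one column. */
static int count_suspect_box_widths(width_fn width)
{
    int suspect = 0;

    assert(width != NULL);
    for (wchar_t c = 0x2500; c <= 0x257F; c++) {
        if (width(c) != 1)
            suspect++;
    }
    return suspect;
}

/* Stub width functions so the logic can be exercised without a locale. */
static int always_one(wchar_t c) { (void)c; return 1; }
static int vline_is_two(wchar_t c) { return c == 0x2502 ? 2 : 1; }

#ifdef DEMO_MAIN
int main(void)
{
    /* wcwidth() depends on LC_CTYPE; it can return -1 for code points
     * that are not printable in the current locale. */
    setlocale(LC_CTYPE, "");
    printf("%d of 128 box-drawing code points are not 1 column wide\n",
           count_suspect_box_widths(wcwidth));
    (void)always_one;
    (void)vline_is_two;
    return 0;
}
#endif
```

Passing the width function as a parameter keeps the check independent of the locale it runs in. A non-zero count under a UTF-8 locale would be a hint that a built-in table (as with -mk_width) is the safer choice.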
Collating seems to change with each release - I've seen several comments
about that (and recall researching one, where both before and after, the
results made no sense, but were different).
Solaris locale support got it wrong for the tab character, as well.
(That was one of the reasons I added character-class configurability
to vi-like-emacs ;-).
Sun does appear to have the situation where most of the people
working for Sun who are interested in localization are Japanese.
On a related note, i had a look at vttest as well as the UTF-8 demo file
from Markus Kuhn after setting -mk_width. Both look much better now in
xterm, the box alignment tests in particular. Again, thanks for pointing
me in the right direction.
Bug-ncurses mailing list
Thomas E. Dickey