Re: [Nmh-workers] colorized/highlighted scan output?

From: Ken Hornstein
Subject: Re: [Nmh-workers] colorized/highlighted scan output?
Date: Thu, 01 Nov 2012 22:38:21 -0400

>I was thinking of looking for ANSI sequences and not counting
>them.  But I don't know if that could get into trouble with
>multibyte characters.  mbtowc() is too much of a mystery to me.

Well, this is where things get "funky".

In the particular case of UTF-8, the only magical bytes are the ones
with the high bit set.  Bytes < 128 are handled "normally".
So assuming you're using the "normal" ANSI escape sequences (and
you're not using 0x9b as a CSI), the multibyte routines will ignore
them.
If you care, what we do in fmt_scan with the multibyte routines is this:

- Use mbtowc to convert a possible multibyte character (example: anything
  in UTF-8 U+0080 or greater) into a "wide" character.

- mbtowc() tells us the number of bytes that character consumed.  For ASCII,
  it's always 1.  For UTF-8, sometimes it's > 1.  If we don't have enough
  room in the buffer for a complete character, we stop.

- We use wcwidth() to see how many columns that character consumes, and
  use that to make sure we don't overrun our field width.

- We then copy that character's bytes over, using the byte count we got
  from mbtowc().
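
The steps above can be sketched roughly as follows.  This is not the
actual fmt_scan code: the function name copy_columns() and its
parameters are hypothetical, and two details are my own assumptions,
namely that characters for which wcwidth() returns -1 are charged zero
columns and that copying simply stops at an invalid sequence.

```c
#define _XOPEN_SOURCE 700       /* asks the headers to declare wcwidth() */
#include <assert.h>
#include <locale.h>             /* setlocale(), which the caller must invoke */
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Repeated in case the feature-test macro above is defined too late. */
int wcwidth(wchar_t wc);

/*
 * Hypothetical helper: copy as much of src into dst as fits in `width`
 * display columns and `dstsize` bytes (including the trailing NUL).
 * Returns the number of columns used.
 */
static int
copy_columns(char *dst, size_t dstsize, const char *src, int width)
{
    size_t remaining = strlen(src);
    size_t out = 0;
    int cols = 0;

    assert(dst != NULL && dstsize > 0);
    mbtowc(NULL, NULL, 0);      /* reset any shift state */

    while (remaining > 0) {
        wchar_t wc;
        int nbytes = mbtowc(&wc, src, remaining);

        if (nbytes <= 0)
            break;              /* invalid sequence: stop (assumption) */
        if (out + (size_t) nbytes >= dstsize)
            break;              /* no room for a complete character */

        int w = wcwidth(wc);
        if (w < 0)
            w = 0;              /* non-printable: charge nothing (assumption) */
        if (cols + w > width)
            break;              /* would overrun the field width */

        memcpy(dst + out, src, (size_t) nbytes);
        out += (size_t) nbytes;
        src += nbytes;
        remaining -= (size_t) nbytes;
        cols += w;
    }
    dst[out] = '\0';
    return cols;
}
```

The caller needs to have run setlocale(LC_ALL, "") (or selected a
UTF-8 locale explicitly) first; otherwise mbtowc() will not decode
UTF-8 sequences as multibyte characters.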

But it occurs to me that we shouldn't actually do any of this for a "don't
count this" format escape, because that stuff should live outside of
the normal string handling routines in fmt_scan().  Also, I'm with Tom that
I'm not so crazy about putting knowledge of ANSI escape sequences directly
into fmt_scan(), because who knows if your terminal supports them?
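
One way to picture a "don't count this" escape is an append routine
that copies bytes verbatim and never touches the column count, so the
formatter needs no knowledge of what the bytes mean.  Everything in
this sketch is hypothetical (struct outbuf, put_uncounted(), and the
field names); it is only meant to show the idea, not a proposed
interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output state; none of these names come from nmh. */
struct outbuf {
    char  *buf;     /* destination buffer */
    size_t size;    /* total bytes available in buf */
    size_t used;    /* bytes written so far, excluding the NUL */
    int    cols;    /* display columns charged so far */
};

/*
 * Append bytes verbatim without charging any columns.
 * Returns 0 on success, -1 if the bytes plus a NUL do not fit.
 */
static int
put_uncounted(struct outbuf *ob, const char *bytes, size_t len)
{
    assert(ob != NULL && ob->buf != NULL && ob->used < ob->size);

    if (len >= ob->size - ob->used)
        return -1;              /* keep one byte for the NUL */

    memcpy(ob->buf + ob->used, bytes, len);
    ob->used += len;
    ob->buf[ob->used] = '\0';
    return 0;                   /* ob->cols is deliberately unchanged */
}
```

With something like this, the escape string comes from the user's
format rather than from the formatter, which keeps terminal-specific
knowledge out of the string handling code.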

David, do you want to implement this?

