Re: lynx-dev UTF-8 display questions (was: Superscripts)

From: Sergei Pokrovsky
Subject: Re: lynx-dev UTF-8 display questions (was: Superscripts)
Date: 11 Jun 2000 14:09:42 +0700
User-agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.6

>>>>> "Klaus" == Klaus Weide <address@hidden> writes:

(replying to my complaint about premature word wrap for UTF-8 text)

  Klaus> You are running into one of the fundamental problems of
  Klaus> displaying UTF-8 with a display library that is not
  Klaus> UTF-8-aware.  The display library (ncurses in your case,
  Klaus> iirc) makes the assumption that one byte == one character
  Klaus> (position).  So it would wrap a line (or possibly truncate
  Klaus> it) after 80 characters in a 80x25 window.

  Klaus> That means that, for lines full of UTF-8 characters in a
  Klaus> range where each character is encoded as two bytes (which
  Klaus> includes Cyrillic), only about half of the available
  Klaus> horizontal display width is usable.
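To make the arithmetic above concrete, here is a minimal Python sketch (an illustration only, not anything taken from lynx or ncurses). The helper names utf8_cost and usable_columns are made up for this example; the second one assumes a display library that charges one screen column per byte.

```python
# Illustration only: compare character count with UTF-8 byte count.
# Both helper names are hypothetical, not part of lynx or ncurses.

def utf8_cost(text):
    """Return (characters, bytes) for text encoded as UTF-8."""
    return len(text), len(text.encode("utf-8"))


def usable_columns(width, bytes_per_char):
    """Characters that fit on a line if every byte is counted as one column."""
    return width // bytes_per_char


if __name__ == "__main__":
    print(utf8_cost("hello"))      # ASCII: one byte per character
    print(utf8_cost("привет"))     # Cyrillic: two bytes per character
    print(usable_columns(80, 2))   # half of an 80-column window
```

For the six-letter Cyrillic word the byte count is twice the character count, which is why a byte-counting library fills only about 40 of the 80 columns.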

  Klaus> Lynx 2.8.3 is actually improved in this respect: now lynx
  Klaus> takes this into account and breaks the line in an appropriate
  Klaus> place.  That's why you see the line broken between words, and
  Klaus> not broken or truncated in the middle of the third word,
  Klaus> which would be the case in previous versions.
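As a rough sketch of the idea of breaking between words while budgeting in bytes rather than characters: the following Python is NOT lynx's actual code (lynx is written in C, and its real line-breaking logic is more involved). The function name wrap_by_bytes and the max_bytes parameter are assumptions made for illustration.

```python
# Hypothetical sketch: greedy word wrap where the line budget is counted
# in UTF-8 bytes, mimicking a display that treats one byte as one column.

def wrap_by_bytes(text, max_bytes=80):
    """Split text into lines at word boundaries, each at most max_bytes
    long when encoded as UTF-8. A single word longer than max_bytes is
    kept whole on its own line rather than split."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if not current or len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            lines.append(current)   # budget exceeded: break before this word
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in wrap_by_bytes("привет мир привет", max_bytes=20):
        print(line)
```

With a 20-byte budget the first two Cyrillic words (19 bytes including the space) share a line and the third moves to the next one, so the break falls between words instead of in the middle of one.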

  Klaus> There is one existing workaround, but only if you compile
  Klaus> lynx with slang instead of (n)curses: compile with
  Klaus> SLANG_MBCS_HACK defined.  For example, (this is the way I
  Klaus> pass additional flags to the compile process)

  Klaus>    ./configure --with-screen=slang [...]
  Klaus>    make

  Klaus> It works well for me in most $TERMinal types (but not all -
  Klaus> although those aren't UTF-8 capable anyway).

I've reinstalled slang and now get full-width lines for the multibyte
characters.  BUT at an unacceptable price: when the cursor passes
through an anchor containing multibyte character(s), the line is
corrupted (it is shifted to the left, so that some text before the
multibyte anchor is lost, while the last characters of the line are
duplicated, because the former tail remains on the screen).

Now I've recompiled lynx without SLANG_MBCS_HACK, and got the
previous behavior back (halved Cyrillic lines, but stable links; in
any case, -dump produces good line lengths).

