emacs-bidi
Re: [emacs-bidi] Suboptimal display-reordering in minibuffer


From: Amit Aronovitch
Subject: Re: [emacs-bidi] Suboptimal display-reordering in minibuffer
Date: Mon, 28 Jun 2010 09:14:36 +0300



On Mon, Jun 28, 2010 at 5:14 AM, Larry Denenberg <address@hidden> wrote:

>The issue is whether we can safely force the echo area messages to
>_always_ be rendered with left-to-right paragraph direction.  This is
>what you are suggesting, right?

Well, I made several suggestions, of which this was one.  Let me flesh
it out a little bit.

I think the best solution is this:  Echo buffers and the minibuffer
should permit but not enforce bidirectionality, and anyone who writes to
them must be sensitive to this fact and appropriately careful.  If you
want to echo "X is not defined" as an LTR sentence with X variable, it's
your job to be sure that X doesn't set the direction to something you
didn't intend.

The trouble is that zillions of messages were written without bidi in
mind, and (as we've seen) at least one doesn't do the right thing.  So
what should we do?  Check every message and fix all the offenders?
You first, Indy.  In the meantime, what is a reasonable alternative?

And here I will stand by my suggestion for forcing LTR.  The reasoning
is something like this:  Emacs messages are written in English, which is
LTR.  They may contain arbitrary text, but that text---even if displayed
RTL---is essentially in quotes and can't change the directionality.


My own suggestion was to set the direction according to the language in LC_MESSAGES ("system-messages-locale" in Emacs) if the proper translation is installed, and to LTR otherwise. However, a quick search seems to indicate that at the moment there is no i18n for Emacs at all (there is a new i18n project, but I did not find any code there: http://savannah.nongnu.org/projects/emacs-i18n ).
Given the above, my suggestion becomes identical to yours: set the directionality to LTR in system messages, possibly leaving an option for the user to force RTL in specific messages.
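
To make the rule concrete, here is a minimal sketch in Python (not Emacs code). The function name, the `translation_installed` flag and the list of RTL languages are all hypothetical and only illustrate the decision described above; nothing like this exists in Emacs as far as I know.

```python
# Illustrative only: a hypothetical helper, not part of Emacs.
# The set of RTL language codes is an assumption and is not exhaustive.
RTL_LANGUAGES = {"ar", "fa", "he", "ur", "yi"}


def message_direction(lc_messages, translation_installed):
    """Pick a paragraph direction for system messages.

    Without an installed translation the messages are in English,
    so the direction falls back to LTR.
    """
    if not translation_installed:
        return "LTR"
    # Reduce e.g. "he_IL.UTF-8" or "ar@latin" to the bare language code.
    language = lc_messages.split("_")[0].split(".")[0].split("@")[0].lower()
    return "RTL" if language in RTL_LANGUAGES else "LTR"


print(message_direction("he_IL.UTF-8", True))   # Hebrew, translated
print(message_direction("he_IL.UTF-8", False))  # Hebrew locale, English text
```

With no translations available, the second case is the only one that ever applies, which is why the rule collapses to "always LTR".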

[snipped some arguments, with which I fully agree]


>Btw, there's something I overlooked before: why exactly is ^ב
>considered a strong R2L character?  Could you please go to it in the "
>*Echo Area 0/1*" buffer, type "C-u C-x =", and show what Emacs tells
>about that character?

First of all, I don't think your procedure works.  You can make the
message appear, and with care you can get a cursor on top of it, but
typing C-u (or most anything else) changes the buffer contents---it's
not called the Echo Area for nothing!  To get your hands on the
character you'd have to write a function that grabs the contents of the
buffer and bind it to a key, or in some other way avoid echoing.


How exactly did you get the ^x?
The echo messages that I see here look like C-x.
Also, I was able to get it in the minibuffer by using the interactive global-set-key command. It seems that what was inserted in the buffer was actually the three characters "C", "-", "א".

But there's no point in trying.  The buffer can't possibly contain an
actual ^ב.  No buffer can.  Buffers and strings can contain only those
characters encodable in 22 bits.  If your input facilities permit, you
can prove this by typing ^Q ^ב; Emacs refuses to insert such a character
(Wrong type argument: char-or-string-p, 67110353).  From the manual:

    In strings and buffers, the only control characters allowed are
    those that exist in ASCII; but for keyboard input purposes, you can
    turn any character into a control character with `C-'.  The
    character codes for these non-ASCII control characters include the
    2**26 bit as well as the code for the corresponding non-control
    character.  Ordinary terminals have no way of generating non-ASCII
    control characters, but you can generate them straightforwardly
    using X and other window systems.
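
The number in the error message can be checked by hand. The following is a small Python sketch (not Emacs code) of the arithmetic, assuming only what is stated above: the control modifier is the 2**26 bit, and buffer characters must fit in 22 bits. The names CTRL_BIT, MAX_CHAR and split_key_code are made up for the illustration.

```python
# Illustrative arithmetic only; the names below are hypothetical.
CTRL_BIT = 1 << 26         # the "2**26 bit" from the manual excerpt
MAX_CHAR = (1 << 22) - 1   # largest code that fits in 22 bits


def split_key_code(code):
    """Split a key code into (has control bit, base character code)."""
    has_ctrl = bool(code & CTRL_BIT)
    base = code & ~CTRL_BIT
    return has_ctrl, base


has_ctrl, base = split_key_code(67110353)
print(has_ctrl, hex(base), chr(base))  # control bit set, base is U+05D1
print(67110353 <= MAX_CHAR)            # False: too large for a buffer
```

So 67110353 is 2**26 plus 0x5D1 (the letter ב), which is well beyond the 22-bit limit.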

Here's what I think is happening:  The code that complains about
undefined characters handles uninsertable characters (things like ^ב and
meta-control-mouse-down) by translating them to visible representation.
So the message contains a real caret followed by ב.  That is, the first
character has no strong directionality, and the directionality is set by
the second character, a non-control ב.

But then the reordering algorithm would have rendered it as ב^ and not as ^ב, which is what you saw. The truth is probably somewhere in between: maybe the handling of uninsertable characters is done AFTER reordering, so from the point of view of the reordering algorithm it is a single character, as Eli said.
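
For reference, the "first strong character" reasoning can be tried outside Emacs. This is a minimal Python sketch using the standard unicodedata module; it is a simplified version of the first-strong rule (it ignores embeddings and isolates) and is not a description of how Emacs implements it. The function name and the sample messages are made up.

```python
import unicodedata

STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, default="LTR"):
    """Return the direction of the first strong character in text.

    Neutral and weak characters (caret, hyphen, space, ...) are skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "LTR"
        if bidi_class in STRONG_RTL:
            return "RTL"
    return default


print(unicodedata.bidirectional("^"))                      # caret is neutral
print(first_strong_direction("^\u05d1 is undefined"))      # caret, then bet
print(first_strong_direction("C-\u05d0 is undefined"))     # starts with "C"
print(first_strong_direction("\u200e^\u05d1 is undefined"))  # LRM first
```

A message that starts with a literal caret followed by ב gets an RTL paragraph, while one spelled "C-א" stays LTR because "C" is the first strong character. Prefixing a LEFT-TO-RIGHT MARK (U+200E) is one way a message writer could pin the direction to LTR under this rule.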

   AA

