
Re: [emacs-bidi] Suboptimal display-reordering in minibuffer


From: Larry Denenberg
Subject: Re: [emacs-bidi] Suboptimal display-reordering in minibuffer
Date: Sun, 27 Jun 2010 22:14:18 -0400

>The issue is whether we can safely force the echo area messages to
>_always_ be rendered with left-to-right paragraph direction.  This is
>what you are suggesting, right?

Well, I made several suggestions, of which this was one.  Let me flesh
it out a little bit.

I think the best solution is this:  Echo buffers and the minibuffer
should permit but not enforce bidirectionality, and anyone who writes to
them must be sensitive to this fact and appropriately careful.  If you
want to echo "X is not defined" as an LTR sentence with X variable, it's
your job to be sure that X doesn't set the direction to something you
didn't intend.

The trouble is that zillions of messages were written without bidi in
mind, and (as we've seen) at least one doesn't do the right thing.  So
what should we do?  Check every message and fix all the offenders?
You first, Indy.  In the meantime, what is a reasonable alternative?

And here I will stand by my suggestion for forcing LTR.  The reasoning
is something like this:  Emacs messages are written in English, which is
LTR.  They may contain arbitrary text, but that text---even if displayed
RTL---is essentially in quotes and can't change the directionality.

You argue that one cannot tell the language from the text:

>But what is the language of a message that includes mixed Hebrew and
>English words or letters?
>
>Emacs allows you to mix several scripts (a.k.a. "languages") in the
>same buffer, so it is no longer clear in what "language" the document
>is written.

In my opinion, you are correct, but this fact is irrelevant to the
problem at hand.  It may well be impossible to figure out the language
of a particular message by examining the text.  But we're not trying to.
We know independently (or are trying to convince ourselves) that these
messages were written in English by English speakers with intended LTR
directionality.

If we accept that Emacs messages are intended as English, that's enough
to say that forcing LTR (to fix problems not foreseen by the writers) is
the right thing, at least insofar as this problem is important enough to
solve.

Here's another possibility.  bidi-paragraph-direction is purely an Emacs
thing, right?  It's not in the Unicode bidi standard.  Is it absolute?
That is, can it be overridden by RLM or RLO characters?  If so, we could
put these buffers in bidi mode with LTR default paragraph direction, but
anyone who really wants RTL can still force it.  But I'm increasingly
skeptical that RTL is *ever* the right thing, unless you're writing a
completely new non-English Emacs.  Can you give me an example of any
message in an English Emacs that should be RTL?
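
As a rough illustration of the question above, here is a minimal Python
sketch of the Unicode "first strong character" heuristic for guessing a
paragraph's base direction. It is not Emacs code and says nothing about
whether a fixed bidi-paragraph-direction can be overridden; the function
name and sample messages are made up for the example. It only shows that,
under the heuristic, a leading RLM or LRM counts as a strong character
while RLO does not.

```python
import unicodedata

# Directional marks and overrides, named here for readability.
LRM, RLM = "\u200e", "\u200f"   # left-to-right mark, right-to-left mark
RLO = "\u202e"                  # right-to-left override


def base_direction(text):
    """Guess 'LTR' or 'RTL' from the first strong character (simplified)."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return "LTR"  # no strong character found: fall back to LTR


english = "X is not defined"          # hypothetical message text
hebrew_first = "\u05d1 is not defined"  # same message with X = bet

print(base_direction(english))             # LTR
print(base_direction(hebrew_first))        # RTL
print(base_direction(LRM + hebrew_first))  # LTR: LRM is a strong L
print(base_direction(RLM + english))       # RTL: RLM is a strong R
print(base_direction(RLO + english))       # LTR: RLO is not a strong character
```

So under this heuristic a writer who wants a particular direction can get
it with a leading mark, which is the behavior the proposal would need from
the echo area as well.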


>If someone could go over at least some of the myriad of calls to
>`message' in Emacs and see if they all tend to be L2R, I would agree
>that we should by default force L2R paragraph direction on the echo
>area.

This is easy:  Just look at your *Messages* buffer.  Tell me if you see
anything RTL.  Even "Wrote <filename>" with a long Hebrew filename is
LTR.  You ask "if they all tend to be L2R"; I'm fairly convinced that
there isn't even a single one that's RTL.


>Btw, there's something I overlooked before: why exactly is ^ב
>considered a strong R2L character?  Could you please go to it in the "
>*Echo Area 0/1*" buffer, type "C-u C-x =", and show what Emacs tells
>about that character?

First of all, I don't think your procedure works.  You can make the
message appear, and with care you can get a cursor on top of it, but
typing C-u (or most anything else) changes the buffer contents---it's
not called the Echo Area for nothing!  To get your hands on the
character you'd have to write a function that grabs the contents of the
buffer and bind it to a key, or in some other way avoid echoing.

But there's no point in trying.  The buffer can't possibly contain an
actual ^ב.  No buffer can.  Buffers and strings can contain only those
characters encodable in 22 bits.  If your input facilities permit, you
can prove this by typing ^Q ^ב; Emacs refuses to insert such a character
(Wrong type argument: char-or-string-p, 67110353).  From the manual:

     In strings and buffers, the only control characters allowed are
     those that exist in ASCII; but for keyboard input purposes, you can
     turn any character into a control character with `C-'.  The
     character codes for these non-ASCII control characters include the
     2**26 bit as well as the code for the corresponding non-control
     character.  Ordinary terminals have no way of generating non-ASCII
     control characters, but you can generate them straightforwardly
     using X and other window systems.
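
The number in the error message can be checked by hand. The short Python
sketch below (the variable names are illustrative, not Emacs identifiers)
sets the 2**26 bit on the code point of ב and compares the result with
the largest value that fits in 22 bits.

```python
# The control modifier bit described in the manual excerpt.
CTRL_MODIFIER_BIT = 2 ** 26

# Largest code that fits in 22 bits, the limit cited for buffers and strings.
MAX_22_BIT = 2 ** 22 - 1

bet = ord("\u05d1")                  # HEBREW LETTER BET, code point 1489
ctrl_bet = CTRL_MODIFIER_BIT | bet   # the key event "control + bet"

print(ctrl_bet)               # 67110353, the value in the error message
print(ctrl_bet > MAX_22_BIT)  # True: too large to be a buffer character
```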

Here's what I think is happening:  The code that complains about
undefined characters handles uninsertable characters (things like ^ב and
meta-control-mouse-down) by translating them to a visible representation.
So the message contains a real caret followed by ב.  That is, the first
character has no strong directionality, and the directionality is set by
the second character, a non-control ב.
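
The claim about the caret can be checked against the Unicode character
database. This is a sketch in Python rather than Emacs Lisp, the message
text is a guess at what the echo area contained, and the helper name is
invented for the example.

```python
import unicodedata


def first_strong(text):
    """Return (character, bidi class) of the first strong character, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("L", "R", "AL"):
            return ch, bidi_class
    return None


# Assumed shape of the message: a literal caret, then a plain bet.
message = "^\u05d1 is undefined"

print(unicodedata.bidirectional("^"))       # ON: a neutral, not strong
print(unicodedata.bidirectional("\u05d1"))  # R: strong right-to-left
print(first_strong(message) == ("\u05d1", "R"))  # True
```

The caret is classed as "other neutral", so the first strong character in
such a message is the bet, and the paragraph comes out RTL.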

/Larry Denenberg
address@hidden
http://larry.denenberg.com/


