Re: [emacs-bidi] Display routines


From: Eli Zaretskii
Subject: Re: [emacs-bidi] Display routines
Date: Tue, 06 Nov 2001 18:54:18 +0200

> From: Matan Ninio <address@hidden>
> Date: 06 Nov 2001 12:24:55 +0200
> 
> here are some I can think of:

Thanks!

Let me comment on some, or ask questions where I'm not sure I
understand.

> -> input and output for iso 8859-8 and 8859-8-i (visual and logical order
>    Hebrew in the "European languages"/non-Unicode standard), and similarly
>    8859-6 for Arabic.

This is not clear: what exactly is the issue to take care of?

I understand that visual-order text should be reordered into logical,
but what do we need to do with 8859-8-i, which is already in logical
order?  The normal code conversions (from 8859-8 to the internal
representation) already work in today's Emacs.  What else is left?

> (what is cp-1255 anyway?)

It's essentially 8859-8 with extensions: most notably, the codes in
the range 128..159 hold characters that are outside the iso8859-8 set
(a.k.a. MS extensions).
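
One way to see the differences concretely is to decode every byte value
with both codecs and list the ones that disagree.  The sketch below
assumes Python's standard "iso8859_8" and "cp1255" codecs as stand-ins
for the two encodings; the helper name differing_codes is made up for
illustration, and undefined codes are simply shown as the replacement
character.

```python
def differing_codes(codec_a="iso8859_8", codec_b="cp1255"):
    """Return (code, char_a, char_b) for every byte the two codecs decode differently."""
    diffs = []
    for code in range(256):
        byte = bytes([code])
        # errors="replace" turns undefined codes into U+FFFD instead of raising
        a = byte.decode(codec_a, errors="replace")
        b = byte.decode(codec_b, errors="replace")
        if a != b:
            diffs.append((code, a, b))
    return diffs


if __name__ == "__main__":
    for code, a, b in differing_codes():
        print(f"{code:3d} (0x{code:02X}): {a!r} vs {b!r}")
```

The Hebrew letters themselves should not show up in the output, since
both encodings place them at the same codes.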

> -> printing of Hebrew files via ps-print or even print-buffer

This should be taken care of by the code I wrote (see my other mail
today); the only thing that should be done in lpr.el and ps-print.el
is the part that justifies RTL paragraphs to the right margin of the
paper and blank-fills whatever is left to the left margin.
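
The justification step amounts to something like the following sketch.
It assumes the lines have already been reordered for display and that
every character occupies one column (fixed-pitch font, no combining
marks); the function names are hypothetical and are not taken from
lpr.el or ps-print.el.

```python
def justify_rtl_line(visual_line, width):
    """Blank-fill on the left so the line ends at the right margin."""
    # str.rjust counts code points, so one column per character is assumed;
    # lines already wider than the page are returned unchanged.
    return visual_line.rjust(width)


def justify_rtl_paragraph(lines, width):
    """Right-justify every line of an RTL paragraph to the same margin."""
    return [justify_rtl_line(line, width) for line in lines]
```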

> -> auto detect of 8859-8 vs. 8859-8-i
> should probably be based on the frequency of pairs of letters.  I can
> give you an algorithm that, given some body of texts, can give a
> probability for each direction.

Please post this algorithm.
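
For concreteness, here is a minimal sketch of what a letter-pair
detector could look like.  It is not the algorithm referred to above
(which has not been posted); the function names, the add-one
smoothing, and the equal-priors assumption are all illustrative.  The
idea is to count adjacent character pairs in a logical-order training
corpus, then score a candidate text both as-is and reversed, and
compare the two likelihoods.

```python
import math
from collections import Counter


def train_bigrams(corpus):
    """Count adjacent character pairs in logical-order training text."""
    return Counter(a + b for a, b in zip(corpus, corpus[1:]))


def log_score(text, counts, total, vocab):
    """Log-likelihood of the text's character pairs, with add-one smoothing."""
    return sum(
        math.log((counts[a + b] + 1) / (total + vocab))
        for a, b in zip(text, text[1:])
    )


def prob_logical(text, counts):
    """Estimate the probability that `text` is in logical (not visual) order."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 leaves room for unseen pairs
    fwd = log_score(text, counts, total, vocab)
    rev = log_score(text[::-1], counts, total, vocab)
    # Equal priors for both orders; clamp to keep math.exp from overflowing.
    diff = max(-700.0, min(700.0, rev - fwd))
    return 1.0 / (1.0 + math.exp(diff))
```

A result near 1 suggests logical order (8859-8-i), a result near 0
suggests visual order (8859-8).  Whole-string reversal is only a crude
model of visual order, since numbers and embedded Latin text are not
reversed in real visual-order files.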

> -> input keymaps and such for Hebrew input
> including the two-way-cursor and the language swap (input and
> paragraph's main direction)

This part belongs in Leim's hebrew.el, which already exists (but, of
course, doesn't handle RTL).  You should already be able to type
"C-u C-\ hebrew RET" (or just "C-\", if your language environment
is set to "Hebrew"), and have your keyboard send Hebrew characters.
If your keyboard sends iso8859-8 codes, "C-x RET k hebrew RET" should
have a similar effect.  So much for the keymaps and the language swap.
(This should be augmented by key sequences to insert bidi formatting
codes, but that's a minor change.)

As for the paragraph main direction, after lots of thinking, I
propose the following model:

  - If the user wants to force the paragraph to be of a certain
    directionality, she should type the LRM or RLM code as the first
    character of the paragraph.

  - Otherwise, Emacs will decide the paragraph direction by looking
    at the first strong character: if it's R, the direction will be
    RTL, otherwise it will be LTR.

This is essentially what UAX#9 says, and it seems to me that there's
no reason to invent something else here.  Especially since Emacs
doesn't have a notion of a paragraph at the C level, and I don't feel
like introducing it, if I can avoid that ;-)
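
The proposed rule can be sketched in a few lines.  This is only an
illustration in Python, not Emacs code: it relies on the bidi classes
reported by the standard unicodedata module, the function name is
hypothetical, and treating class AL (Arabic letters) the same as R is
an assumption taken from UAX#9 rather than from the wording above.
Because LRM and RLM are themselves strong characters, the "force the
direction" case needs no special handling.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R


def paragraph_direction(paragraph):
    """Return "RTL" or "LTR" based on the first strong character."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            return "RTL"
        if bidi_class == "L":
            return "LTR"
        # Digits, spaces and punctuation are weak or neutral: keep looking.
    return "LTR"  # no strong character at all: default to LTR
```

For example, a paragraph of Latin text that starts with RLM comes out
as RTL, while one that starts with digits followed by a Hebrew letter
is decided by the Hebrew letter.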

Comments?

> -> visual2logical-region in case you pasted something in the wrong
>    direction 

Do you envision a possibility of buffers with visual order?  If so,
could you explain why you think such buffers are required?

If we don't have visual-order buffers, the probability of bumping
into this situation seems very remote.

> -> an algorithm for some support for "rectangle" in bidi texts
> (may be a pain, that last one)

There has been some discussion on the Emacs development list (or was
it bug-gnu-emacs? I forget which one) about the need for
non-contiguous regions.  If something like that gets added to Emacs,
we could use it for bidi.

> -> W3 mode ???

Not sure what you mean: do you mean HTML bidi directives?

> that's it for now, but there probably are many others (Word->displayable
> form??)

You mean, to read Word .doc files?  What are the bidi-specific aspects
of that?  MS-Word implements UAX#9 (in fact, MS are behind most of
the mess in the bidi algorithm in its present form ;-), so, after
conversion from their proprietary format, what you have is
logical-order text.  Did I miss something?

Thanks again for your ideas.


