Re: [emacs-bidi] debugging visual-to-logical


From: Eli Zaretskii
Subject: Re: [emacs-bidi] debugging visual-to-logical
Date: Wed, 21 Nov 2001 19:57:06 +0200

[Seems like my original reply didn't make it, so I'm resending.
Sorry if you get this twice.]

> Date: Wed, 21 Nov 2001 14:51:57 +0330 (IRT)
> From: Behdad Esfahbod <address@hidden>
> 
> > > Date: Tue, 20 Nov 2001 22:01:59 +0330 (IRT)
> > > From: Behdad Esfahbod <address@hidden>
> > > > > 
> > > > > No, it's ".3210-555-02 :SI REBMUN YM".
> > > > 
> > > > In what implementation of UAX#9 did you see this, and what type did
> > > > that implementation assign to the upper-case letters and the digits?
> > > > Especially the AN vs EN and AL vs R is important.
> > > 
> > > It was the Reference Implementation's output, with A-F as AL and
> > > G-Z as R, 0-5 as EN, and 6-9 as AN.
> > 
> > Didn't you (or someone else) just say that the reference
> > implementation has incorrect definition of the type of `-'?
> 
> The definition is not incorrect, just different from Unicode's
> types; this is the list of differences:
> 
> \x00-\x2f:
> CapRTL:
>   ON, ON, ON, ON,LTR,RTL, ON, ON, ON, ON, ON, ON, ON, BS,RLO,RLE, /* 00-0f */
>  LRO,LRE,PDF, WS, ON, ON, ON, ON, ON, ON, ON, ON, ON, ON, ON, ON, /* 10-1f */
>   WS, ON, ON, ON, ET, ON, ON, ON, ON, ON, ON, ET, CS, ON, ES, ES, /* 20-2f */
> Unicode:
>   BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,SS ,BS ,SS ,WS ,BS ,BN ,BN ,
>   BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BN ,BS ,BS ,BS ,SS ,
>   WS ,ON ,ON ,ET ,ET ,ET ,ON ,ON ,ON ,ON ,ON ,ET ,CS ,ET ,CS ,ES ,

Now I'm _really_ confused: I thought we were talking about the
differences between the Reference Implementation and the Unicode
database, not between FriBidi and Unicode.  Or are you saying that
FriBidi follows the Reference?

> Char  CapRTL  Unicode
> -----------------------
> 6-9   AN      EN
> @     RTL     ON
> A-F   AL      L
> G-Z   R       L
> \     BS      ON
> `     NSM     ON
> |     SS      ON
> ~     WS      ON
> 
> As you can see, it's much easier to test types like AN, AL, BS, NSM,
> SS, RLE, LRE, RLO, LRO and PDF with CapRTL than with Unicode or your
> variant, so we decided to use it for test purposes in fribidi.

I agree that some codes must be allocated for character types that
don't exist in the 7-bit ASCII area, but ON is not one of them.  So
why was it a good idea to redefine `-' as ON?  `-' is a very
frequently used character, and one that's involved in quite a few
situations where the UAX#9 algorithm does not-so-wise things.
Redefining the type of `-' is bound to produce lots of confusion, and
complicates the task of judging the results from the user's point of
view.
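
To make the comparison concrete, the CapRTL assignments quoted above
can be written down as a small lookup. This is only a sketch built
from the table and listing in this message, not FriBidi's or the
Reference Implementation's actual code; the names CAPRTL_LOW and
caprtl_type are made up for illustration, and falling back to Python's
own Unicode database for characters the message does not mention is an
assumption.

```python
import unicodedata

# CapRTL types for code points 0x00-0x2f, copied from the listing
# quoted in this message (16 entries per row).
CAPRTL_LOW = (
    "ON ON ON ON LTR RTL ON ON ON ON ON ON ON BS RLO RLE "
    "LRO LRE PDF WS ON ON ON ON ON ON ON ON ON ON ON ON "
    "WS ON ON ON ET ON ON ON ON ON ON ET CS ON ES ES"
).split()

# Single-character overrides from the "Char CapRTL Unicode" table.
CAPRTL_SINGLE = {"@": "RTL", "\\": "BS", "`": "NSM", "|": "SS", "~": "WS"}


def caprtl_type(ch):
    """Return the CapRTL bidi type of one character (hypothetical helper)."""
    cp = ord(ch)
    if cp < len(CAPRTL_LOW):
        return CAPRTL_LOW[cp]
    if "0" <= ch <= "5":
        return "EN"
    if "6" <= ch <= "9":
        return "AN"
    if "A" <= ch <= "F":
        return "AL"
    if "G" <= ch <= "Z":
        return "R"
    if ch in CAPRTL_SINGLE:
        return CAPRTL_SINGLE[ch]
    # Assumption: anything not listed in the message keeps the type
    # given by the Unicode database bundled with Python.
    return unicodedata.bidirectional(ch)


if __name__ == "__main__":
    # Show where CapRTL and the bundled Unicode data disagree for the
    # characters of the phone-number example.
    for ch in "02-555-0123":
        print(repr(ch), caprtl_type(ch), unicodedata.bidirectional(ch))
```

Note that the Unicode database bundled with a current Python is newer
than the 2001 data quoted above, so the second column it prints need
not match the "Unicode" rows in this message.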


