Re: [emacs-bidi] bidi tables for unicode chars


From: Alex Schroeder
Subject: Re: [emacs-bidi] bidi tables for unicode chars
Date: Sat, 10 Nov 2001 21:05:58 +0100
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1 (i686-pc-linux-gnu)

Roozbeh Pournader <address@hidden> writes:

> BTW, would you please consider looking at:
> 
> <http://www.unicode.org/unicode/reports/tr9/#Bidirectional_Character_Types>
> 
> Which assigns character types for unassigned characters? In short, these
> are the classes for all the unassigned characters in the mentioned ranges:
> 
> 0590--05FF, FB1D--FB4F                   R
> 0600--07BF, FB50--FDFF, FE70--FEFF       AL
> All others                               L

Is this important to you?  I know that such ranges are defined in the
report:

|        |L   |Left-to-Right|LRM, Most alphabetic, syllabic, Han ideographic   |
|        |    |             |characters, digits that are neither European nor  |
|        |    |             |Arabic, all unassigned characters except in the   |
|        |    |             |ranges (0590-05FF, FB1D-FB4F) and (0600-07BF,     |
|        |    |             |FB50-FDFF, FE70-FEFF).                            |

|Strong  |R   |Right-to-Left|RLM, Hebrew alphabet, most punctuation specific to|
|        |    |             |that script, all unassigned characters in the     |
|        |    |             |ranges (0590-05FF, FB1D-FB4F)                     |
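
For reference, the defaults quoted above could be written as a small
fallback function.  This is only a sketch: the function name is made
up, and the symbols bidi-category-r and bidi-category-al are assumed
by analogy with the bidi-category-l style names used in my table.

```elisp
;; Hypothetical helper, not part of the existing bidi code.
;; The ranges are the TR9 defaults for unassigned characters.
(defun my-bidi-default-category (char)
  "Return the TR9 default bidi category symbol for unassigned CHAR."
  (cond
   ;; Hebrew block and Hebrew presentation forms default to R.
   ((or (and (>= char #x0590) (<= char #x05FF))
        (and (>= char #xFB1D) (<= char #xFB4F)))
    'bidi-category-r)
   ;; Arabic-related ranges default to AL.
   ((or (and (>= char #x0600) (<= char #x07BF))
        (and (>= char #xFB50) (<= char #xFDFF))
        (and (>= char #xFE70) (<= char #xFEFF)))
    'bidi-category-al)
   ;; Everything else defaults to L.
   (t 'bidi-category-l)))
```

Such a function would only be consulted for characters that have no
entry in the table, so the table itself would stay untouched.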

However, I think that using UnicodeData.txt as a basis for the table
will be the correct thing in all situations.  When the characters get
assigned, that file will be updated.  And thus I can just parse it
again when necessary and use the new table for my code.  I don't think
it makes sense to manually edit the table at the moment.  Especially
considering that these types may change in the future:

Note:  Unassigned characters are given strong types in the algorithm.
       This is an explicit exception to the general Unicode conformance
       requirements with respect to unassigned characters. As characters
       become assigned in the future, these bidirectional types may change.
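
Re-parsing the file when it changes could be sketched roughly as
follows.  This is not the code I actually use: the function name is
hypothetical, and it assumes the usual UnicodeData.txt layout of
semicolon-separated fields, with the code point in the first field and
the bidi class in the fifth.  Range markers ("..., First>" and
"..., Last>" entries) are not expanded here.

```elisp
;; Sketch only: `my-bidi-parse-unicode-data' is a hypothetical name.
(defun my-bidi-parse-unicode-data (file)
  "Return an alist of (CODEPOINT . BIDI-SYMBOL) read from FILE.
FILE is assumed to be in UnicodeData.txt format: one character per
line, fields separated by semicolons, code point in the first field
and bidi class in the fifth."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (let (table)
      ;; Group 1 is the hex code point, group 2 the bidi class.
      (while (re-search-forward
              "^\\([0-9A-Fa-f]+\\);[^;\n]*;[^;\n]*;[^;\n]*;\\([A-Za-z]+\\);"
              nil t)
        (push (cons (string-to-number (match-string 1) 16)
                    ;; "NSM" becomes the symbol bidi-category-nsm, etc.
                    (intern (concat "bidi-category-"
                                    (downcase (match-string 2)))))
              table))
      ;; Entries were pushed in reverse; restore file order.
      (nreverse table))))
```

The result has the same shape as the table entries shown below, so a
new release of the file just means running the parser again.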

Note that for some of these characters the UnicodeData.txt file did
provide information (not consistent with TR9):

         (?\x0586 . bidi-category-l) ;; 
         (?\x0587 . bidi-category-l) ;; 
         (?\x0589 . bidi-category-l) ;; ARMENIAN PERIOD
         (?\x058A . bidi-category-on) ;; 
         (?\x0591 . bidi-category-nsm) ;; 
         (?\x0592 . bidi-category-nsm) ;; 
         (?\x0593 . bidi-category-nsm) ;; 
         (?\x0594 . bidi-category-nsm) ;; 
         (?\x0595 . bidi-category-nsm) ;; 

Alex.
-- 
http://www.emacswiki.org/


