emacs-devel

Re: Bidirectional text and URLs


From: Lars Magne Ingebrigtsen
Subject: Re: Bidirectional text and URLs
Date: Sun, 30 Nov 2014 19:13:54 +0100
User-agent: Gnus/5.130012 (Ma Gnus v0.12) Emacs/25.0.50 (gnu/linux)

Eli Zaretskii <address@hidden> writes:

> Let's clear up terminology first, OK?

Thanks for the explanation.

> To summarize: Latin characters are displayed left to right, even in
> RTL paragraphs, while right-to-left characters are always displayed
> right to left.  Neutral characters (slash, period) take the direction
> of the surrounding text.

Right.

> HTH

It does, yes.

> May I ask why you came up with the question?

Because I was wondering whether my suggestion from yesterday (that we
insert LRO/PDF characters into URLs if there is an RLO present in the
buffer when recognising URLs) is at all feasible, and from your
explanation, it seems like it would be.

And it would not require reimplementing bidi.c in Lisp.

I agreed with your objection that if we used such a scheme, then the
discussion we're having here would look pretty incomprehensible.
However, thinking about it a bit more, this is really favouring
meta-discussion over usage, and I think we should be leery of doing
that.

Here's my proposal again, fleshed out with examples, for the algorithms
that recognise (and make buttons out of) URLs and the like in email
(etc.) buffers:

1) If there are no right-to-left overrides in the buffer, then do
nothing special.  This will cover 99.996% of all buffers.

2) If there is an RLO in the buffer, then, after recognising a URL, we
treat it further:

* If it contains no strongly right-to-left characters, we just wrap it
  in an LRO/PDF pair.  URLs like "http://myspace.com" will then be
  guaranteed to be displayed reading left-to-right.

* If the URL is like http://אבג.דהוזחט.קום, we would segment the URL
  into strongly-left-to-right-with-weak-chars and
  strongly-right-to-left-with-weak-chars segments.  We wrap each
  left-to-right-with-weak-chars segment in an LRO/PDF pair.

  For that URL, this would be

  LRO http:// PDF אבג.דהוזחט.קום 

Emacs already exposes the weak/strong/LTR/RTL status of each character,
so a function to do this LRO/PDF insertion is trivial.  It's like a
seven-line Elisp function or something.
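
To make the segmentation concrete, here is a rough sketch of the scheme
in Python rather than Elisp, purely as an illustration and not the
function that would go into Emacs.  It makes some assumptions that are
not spelled out above: the trigger is a right-to-left override (U+202E)
anywhere in the buffer text, weak and neutral characters join whatever
run precedes them (left-to-right at the very start), and the names
direction_runs and treat_url are made up for this example.

```python
import unicodedata

LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def direction_runs(url):
    """Split URL into [direction, text] runs.

    Strong L characters start or extend an "ltr" run, strong R/AL
    characters an "rtl" run.  Weak and neutral characters (assumption)
    are attached to the run that precedes them.
    """
    runs = []
    for ch in url:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            direction = "ltr"
        elif bidi_class in ("R", "AL"):
            direction = "rtl"
        else:
            # Weak/neutral: inherit from the previous run, or "ltr"
            # if the URL starts with such a character.
            direction = runs[-1][0] if runs else "ltr"
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return runs


def treat_url(url, buffer_text):
    """Return URL with each left-to-right run wrapped in LRO ... PDF.

    Does nothing unless BUFFER_TEXT contains an RLO (assumed trigger).
    """
    if RLO not in buffer_text:
        return url
    return "".join(
        LRO + text + PDF if direction == "ltr" else text
        for direction, text in direction_runs(url)
    )
```

Calling treat_url("http://myspace.com", text) on a buffer without an RLO
returns the URL untouched; with an RLO present the whole URL comes back
wrapped in one LRO/PDF pair.  For the mixed URL above, only the
"http://" run is wrapped and the right-to-left part is left alone.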

From what you say, it sounds like it would make the display of these URLs
acceptable for bidi readers, too -- this would be the normal display of
these URLs, anyway.  The only thing we're protecting the users from is
shenanigans.

And discussions like this, of course, since all the URLs would display
"correctly".  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


