Re: ASCII-folded search [was: Re: Upcoming loss of usability ...]
From: Eli Zaretskii
Subject: Re: ASCII-folded search [was: Re: Upcoming loss of usability ...]
Date: Thu, 18 Jun 2015 08:27:03 +0300

> From: "Stephen J. Turnbull" <address@hidden>
> Date: Thu, 18 Jun 2015 13:52:49 +0900
> Cc: address@hidden
>
> Marcin Borkowski writes:
>
> > On the other hand, it would be great if we had an "ascii-folding"
> > option, making (some reasonable subset of) Unicode "equivalent" to
> > ASCII,
>
> I believe Emacs already implements NFD normalization.

Yes, see ucs-normalize-NFD-region and friends.

> All you need after that is to skip compose characters when
> searching.

No, it's much more complex than that. For starters, normalization
won't convert U+2018 etc. to their ASCII counterparts. The Unicode
Standard doesn't consider those even compatibility-equivalent. And
for matching just the base characters (which is what I presume is
meant here by "ascii-folding"), we'd need to handle correctly any
number of combinations of pre-composed and decomposed character
sequences in both the search string and the text we search, and
implement that on the fly, since the buffer text obviously cannot be
transformed for these purposes.
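
To make the two problems above concrete, here is a minimal sketch in
Python rather than Emacs Lisp. It is not how Emacs does or should
implement this. It uses only the standard unicodedata module; the
helper names fold and folded_search are hypothetical, made up for this
illustration. The sketch folds both the search string and the text to
base characters without modifying the text, then maps the match back to
positions in the original text. "Combining mark" is approximated here as
"non-zero canonical combining class", which is an assumption that does
not cover every script.

```python
import unicodedata


def fold(text):
    """Return (folded, index_map) for text.

    folded is text decomposed with NFD, minus every character whose
    canonical combining class is non-zero. index_map[i] is the index in
    text of the original character that produced folded[i].
    (Hypothetical helper, for illustration only.)
    """
    folded = []
    index_map = []
    for i, ch in enumerate(text):
        for d in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(d) == 0:
                folded.append(d)
                index_map.append(i)
    return "".join(folded), index_map


def folded_search(needle, haystack):
    """Find needle in haystack, comparing base characters only.

    Returns (start, end) as indices into the ORIGINAL haystack, or None.
    The haystack itself is never modified.
    """
    key, _ = fold(needle)
    if not key:
        return None
    folded, index_map = fold(haystack)
    pos = folded.find(key)
    if pos < 0:
        return None
    start = index_map[pos]
    end = index_map[pos + len(key) - 1] + 1
    # Extend the match over combining marks that follow the last base
    # character in the original (decomposed) text.
    while end < len(haystack) and unicodedata.combining(haystack[end]):
        end += 1
    return start, end


# Decomposed "cafe" + U+0301, then precomposed "caf" + U+00E9.
text = "un cafe\u0301 ou un caf\u00e9"
print(folded_search("cafe", text))
print(folded_search("caf\u00e9", "cafe\u0301"))

# U+2018 has no decomposition, so normalization leaves it alone and an
# ASCII apostrophe does not match it.
print(unicodedata.normalize("NFKD", "\u2018") == "\u2018")
print(folded_search("'", "\u2018quoted\u2019"))
```

Mapping punctuation such as U+2018 to ASCII would need a separate,
hand-chosen table on top of this, and a real implementation would also
have to work incrementally on buffer text rather than folding the whole
text up front as this sketch does.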
So yes, this feature is something that's sorely needed, but volunteers
need to know that the task is not too easy (or else it would have been
done long ago). Interested individuals can start by studying the
following references:

  . Sections 5.18 "Case Mappings" and 5.19 "Mapping Compatibility
    Variants" of the Unicode Standard
  . UTN#5 "Canonical Equivalence in Applications"
    (http://www.unicode.org/notes/tn5/)
  . UTR#15 "Unicode Normalization Forms"
    (http://unicode.org/reports/tr15/)