emacs-bug-tracker

[debbugs-tracker] bug#25366: closed (26.0.50; [:blank:] character class


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#25366: closed (26.0.50; [:blank:] character class should match all Unicode horizontal whitespace)
Date: Fri, 06 Jan 2017 19:22:02 +0000

Your message dated Fri, 06 Jan 2017 19:21:05 +0000
with message-id <address@hidden>
and subject line Re: bug#25366: 26.0.50; [:blank:] character class should match 
all Unicode horizontal whitespace
has caused the debbugs.gnu.org bug report #25366,
regarding 26.0.50; [:blank:] character class should match all Unicode 
horizontal whitespace
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
25366: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25366
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
Date: Thu, 05 Jan 2017 14:46:01 +0100

(string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
=> nil, expected 0

[[:blank:]] should be the same as \h in PCRE.


In GNU Emacs 26.0.50.26 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.8)
 of 2017-01-05 built on unknown
Repository revision: d88cdad2847726438c7d1de9fd2651c4be9243aa
Windowing system distributor 'The X.Org Foundation', version 11.0.11501000
System Description:     Ubuntu 14.04 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level

Configured using:
 'configure --with-modules --enable-checking
 --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0''

Configured features:
XPM JPEG TIFF GIF PNG SOUND GSETTINGS NOTIFY GNUTLS FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message subr-x puny seq byte-opt gv
bytecomp byte-compile cl-extra cconv dired dired-loaddefs format-spec
rfc822 mml mml-sec password-cache epa derived epg epg-config gnus-util
rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils help-mode easymenu cl-loaddefs pcase
cl-lib debug time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript case-table epa-hook jka-cmpr-hook help
simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 182571 10570)
 (symbols 48 31257 1)
 (miscs 40 340 231)
 (strings 32 71112 6419)
 (string-bytes 1 1678721)
 (vectors 16 14561)
 (vector-slots 8 529555 10250)
 (floats 8 183 150)
 (intervals 56 250 6)
 (buffers 976 13)
 (heap 1024 36602 1391))

-- 
Google Germany GmbH
Erika-Mann-Straße 33
80636 München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Matthew Scott Sucherman, Paul Terence Manicle

This e-mail is confidential.  If you are not the right addressee please do not
forward it, please inform the sender, and please erase this e-mail including
any attachments.  Thanks.



--- End Message ---
--- Begin Message ---
Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
Date: Fri, 06 Jan 2017 19:21:05 +0000


Philipp Stephani <address@hidden> wrote on Fri, 6 Jan 2017 at 20:10:
Eli Zaretskii <address@hidden> wrote on Fri, 6 Jan 2017 at 16:11:
> From: Philipp Stephani <address@hidden>
> Date: Fri, 06 Jan 2017 15:00:22 +0000
> Cc: address@hidden
>
http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>
>  Patches to that effect are welcome.
>
> Here's a patch.

Thanks.  A few minor comments below.

> +/* Return true if C is a horizontal whitespace character, as defined
> +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
> +bool
> +blankp (int c)
> +{
> +  if (c == '\t')
> +    return true;

Why does this test explicitly only for a TAB?  What about SPC, for
example?

Because TAB is the only blank character that doesn't have the general category Zs.
I've now also included space explicitly and added a comment. The risk that the general category of space will ever change seems very small.
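
For readers without the Emacs sources at hand, the check discussed above can be sketched as standalone C. This is only an illustration, not the patch that was pushed: it assumes a hard-coded list of code points with general category Zs in place of Emacs's character-property lookup, and the names is_space_separator, blankp_sketch and blankp_sketch_selfcheck are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a general-category lookup: returns true
   if C is one of the code points with general category Zs (space
   separator).  A real implementation would consult the Unicode
   character database instead of a hard-coded list.  */
static bool
is_space_separator (int c)
{
  return (c == 0x0020                       /* SPACE */
          || c == 0x00A0                    /* NO-BREAK SPACE */
          || c == 0x1680                    /* OGHAM SPACE MARK */
          || (0x2000 <= c && c <= 0x200A)   /* EN QUAD .. HAIR SPACE */
          || c == 0x202F                    /* NARROW NO-BREAK SPACE */
          || c == 0x205F                    /* MEDIUM MATHEMATICAL SPACE */
          || c == 0x3000);                  /* IDEOGRAPHIC SPACE */
}

/* Return true if C is horizontal whitespace: TAB, SPACE, or any
   character whose general category is Zs.  TAB needs its own test
   because it is a control character, not a space separator; SPACE is
   tested explicitly as well, mirroring the review comment.  */
static bool
blankp_sketch (int c)
{
  if (c == '\t' || c == ' ')
    return true;
  return is_space_separator (c);
}

/* Small self-check covering the cases raised in the thread.  */
static void
blankp_sketch_selfcheck (void)
{
  assert (blankp_sketch ('\t'));
  assert (blankp_sketch (' '));
  assert (blankp_sketch (0x200A));   /* HAIR SPACE, from the bug report */
  assert (!blankp_sketch ('\n'));    /* vertical whitespace is not blank */
  assert (!blankp_sketch ('a'));
}
```

Calling blankp_sketch_selfcheck () from a test program exercises the HAIR SPACE case from the original report alongside TAB and SPACE.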
 

> --- a/doc/lispref/searching.texi
> +++ b/doc/lispref/searching.texi
> @@ -553,7 +553,10 @@ Char Classes
>  (@pxref{Character Properties}) indicates they are alphabetic
>  characters.
>  @item [:blank:]
> -This matches space and tab only.
> +This matches horizontal whitespace, as defined by Unicode Technical
> +Standard #18.  In particular, it matches tabs and characters whose
> +Unicode @samp{general-category} property (@pxref{Character
> +Properties}) indicates they are spacing separators.

Similarly here: I find the lack of reference to a space potentially
confusing.

Added.
 

> +** The regular expression character class [:blank:] now matches
> +Unicode horizontal whitespace as defined in
> +http://www.unicode.org/reports/tr18/tr18-19.html#blank.

The reference to a particular version of UTS#18 might become obsolete
when a new version is released.  So I suggest to provide a general
reference to the report and its section, not an exact URL.

Done. 


Pushed to master as 512e9886be. 

--- End Message ---
