bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
From: Eli Zaretskii
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Mon, 06 Apr 2020 17:21:34 +0300

> From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> Cc: Mattias Engdegård <mattiase@acm.org>,
> 40407@debbugs.gnu.org
> Date: Mon, 06 Apr 2020 19:10:48 +0900
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> - if (BUFFERP (dst_object))
> >> + if (EQ (dst_object, Qt))
> >> + {
> >> + /* Fast path for ASCII-only input and an ASCII-compatible coding:
> >> + act as identity. */
> >> + Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
> >> + if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
> >> + && (STRING_MULTIBYTE (string)
> >> + ? (chars == bytes) : string_ascii_p (string)))
> >> + return string;
>
> While using the latest master branch, I noticed this became the cause of
> decoding error.
>
> The simple reproducible test is,
>
> (decode-coding-string "&abc" 'utf-7-imap)
> => "&abc"
>
> like the above result, decoding utf-7-imap didn't work.
>
> Because (coding-system-get 'utf-7-imap :ascii-compatible-p) => t.

Thanks.

> I'm not sure, 'utf-7* should be fixed as non ascii-compatible, or
> string_ascii_p() should check more strictly.

The former, since UTF-7 is definitely *not* ASCII-compatible. Does
the patch below produce good results?

Kenichi, why was the :coding-type of the UTF-7 coding systems set to
'utf-8'? Wouldn't it be better to set it to 'utf-16'? Or is there
some subtlety here that we should be aware of? Do you have any
comments on the patch below?

Thanks.

diff --git a/src/coding.c b/src/coding.c
index 97a6eb9..71ff93c 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -11301,7 +11301,10 @@ DEFUN ("define-coding-system-internal", Fdefine_coding_system_internal,
           CHECK_CODING_SYSTEM (val);
         }
       ASET (attrs, coding_attr_utf_bom, bom);
-      if (NILP (bom))
+      if (NILP (bom)
+          /* UTF-7 has :coding-type set to 'utf-8' (why not
+             'utf-16'?), but it is definitely NOT ASCII-compatible. */
+          && !EQ (name, Qutf_7) && !EQ (name, Qutf_7_imap))
         ASET (attrs, coding_attr_ascii_compat, Qt);
 
       category = (CONSP (bom) ? coding_category_utf_8_auto
@@ -11673,6 +11676,9 @@ syms_of_coding (void)
   DEFSYM (Qutf_8_unix, "utf-8-unix");
   DEFSYM (Qutf_8_emacs, "utf-8-emacs");
 
+  DEFSYM (Qutf_7, "utf-7");
+  DEFSYM (Qutf_7_imap, "utf-7-imap");
+
 #if defined (WINDOWSNT) || defined (CYGWIN)
   /* No, not utf-16-le: that one has a BOM. */
   DEFSYM (Qutf_16le, "utf-16le");
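
To illustrate how the ASCII-compatible attribute interacts with the
identity fast path, here is a minimal standalone sketch. It is not
Emacs code: struct coding_sketch, bytes_ascii_p,
can_use_identity_fast_path and utf8_type_ascii_compat are hypothetical
names invented for this illustration, and the model ignores everything
except the flag and the ASCII check. It only mirrors the two conditions
discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a coding system's
   attributes; the real attributes live in a Lisp vector.  */
struct coding_sketch
{
  const char *name;
  bool ascii_compat;
};

/* Return true if every byte of S (LEN bytes) is in the range 0..127.  */
static bool
bytes_ascii_p (const char *s, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if ((unsigned char) s[i] >= 0x80)
      return false;
  return true;
}

/* Model of the fast-path test: the decoder is skipped and the input
   is returned unchanged only when the coding system is flagged
   ASCII-compatible and the input is pure ASCII.  */
static bool
can_use_identity_fast_path (const struct coding_sketch *cs,
                            const char *s, size_t len)
{
  return cs->ascii_compat && bytes_ascii_p (s, len);
}

/* Model of the patched condition for coding systems whose type is
   utf-8: flag them ASCII-compatible only when there is no BOM and
   the name is neither "utf-7" nor "utf-7-imap".  */
static bool
utf8_type_ascii_compat (const char *name, bool has_bom)
{
  return (!has_bom
          && strcmp (name, "utf-7") != 0
          && strcmp (name, "utf-7-imap") != 0);
}
```

In this model, a utf-7-imap entry with ascii_compat set to true takes
the fast path for the input "&abc", which corresponds to the reported
symptom. Once the flag is computed with utf8_type_ascii_compat, the
same input goes through the decoder, while utf-8 keeps the shortcut.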