bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#24784: 26.0.50; JSON strings with utf-16 escape codes


From: Philipp Stephani
Subject: bug#24784: 26.0.50; JSON strings with utf-16 escape codes
Date: Sat, 31 Dec 2016 16:53:27 +0000



Dmitry Gutov <address@hidden> schrieb am Di., 25. Okt. 2016 um 01:19 Uhr:
Philipp,

Thanks. Some comments:

On 24.10.2016 22:57, Philipp Stephani wrote:

> +(defsubst json--decode-utf-16-surrogates (high low)

IIRC, there might be no actual benefit from making it a defsubst. If
someone could benchmark it, I'd like to see the result.

Agreed; converted to defun. I've only used defsubst because some other helper functions also used defsubst.
 

> +     ;; Special-case UTF-16 surrogate pairs,
> +     ;; cf. https://tools.ietf.org/html/rfc7159#section-7
> +     ((looking-at
> +       (rx (group (any "Dd") (any "89ABab") (= 2 (any "0-9A-Fa-f")))
> +           "\\u" (group (any "Dd") (any "C-Fc-f") (= 2 (any "0-9A-Fa-f")))))
> +      (json-advance 10)
> +      (json--decode-utf-16-surrogates
> +       (string-to-number (match-string 1) 16)
> +       (string-to-number (match-string 2) 16)))

Shouldn't this go below the UTF-8 case, as the less-frequent one?

No, the below case is more general and therefore has to come last.
 

>  (ert-deftest test-json-encode-string ()
>    (should (equal (json-encode-string "foo") "\"foo\""))
>    (should (equal (json-encode-string "a\n\fb") "\"a\\n\\fb\""))
> -  (should (equal (json-encode-string "\nasdфыв\u001f\u007ffgh\t")
> -                 "\"\\nasdфыв\\u001f\u007ffgh\\t\"")))
> +  (should (equal (json-encode-string "\nasdфыв𠄞\u001f\u007ffgh\t")
> +                 "\"\\nasdфыв𠄞\\u001f\u007ffgh\\t\"")))

Why are we testing string encoding here?

It's not 100% related to the patch, but I think it can be included for symmetry reasons (testing encoding as well as decoding). 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]