Re: master e39cb515a10 1/4: Correctly handle non-BMP characters in Andro
From: Eli Zaretskii
Subject: Re: master e39cb515a10 1/4: Correctly handle non-BMP characters in Android content file names
Date: Sat, 23 Mar 2024 12:24:03 +0200

> diff --git a/lisp/term/android-win.el b/lisp/term/android-win.el
> index 8d262e5da98..6512ef81ff7 100644
> --- a/lisp/term/android-win.el
> +++ b/lisp/term/android-win.el
> @@ -529,5 +529,94 @@ accessible to other programs."
> (android-browse-url-internal url send))
>
>
> +;; Coding systems used by androidvfs.c.
> +
> +(define-ccl-program android-encode-jni
> +  `(2 ((loop
> +        (read r0)
> +        (if (r0 < #x1) ; 0x0 is encoded specially in JNI environments.
> +            ((write #xc0)
> +             (write #x80))
> +          ((if (r0 < #x80) ; ASCII
> +               ((write r0))
> +             (if (r0 < #x800) ; \u0080 - \u07ff
> +                 ((write ((r0 >> 6) | #xC0))
> +                  (write ((r0 & #x3F) | #x80)))
> +               ;; \u0800 - \uFFFF
> +               (if (r0 < #x10000)
> +                   ((write ((r0 >> 12) | #xE0))
> +                    (write (((r0 >> 6) & #x3F) | #x80))
> +                    (write ((r0 & #x3F) | #x80)))
> +                 ;; Supplementary characters must be converted into
> +                 ;; surrogate pairs before encoding.
> +                 (;; High surrogate
> +                  (r1 = ((((r0 - #x10000) >> 10) & #x3ff) + #xD800))
> +                  ;; Low surrogate.
> +                  (r2 = (((r0 - #x10000) & #x3ff) + #xDC00))
> +                  ;; Write both surrogate characters.
> +                  (write ((r1 >> 12) | #xE0))
> +                  (write (((r1 >> 6) & #x3F) | #x80))
> +                  (write ((r1 & #x3F) | #x80))
> +                  (write ((r2 >> 12) | #xE0))
> +                  (write (((r2 >> 6) & #x3F) | #x80))
> +                  (write ((r2 & #x3F) | #x80))))))))
> +        (repeat))))
> +  "Encode characters from the input buffer for Java virtual machines.")

AFAIU, this is because Java uses UTF-16 encoded strings to support
Unicode, is that right?  If so, why not use the encode-coding-* and
decode-coding-* functions (e.g. encode-coding-string and
decode-coding-string) to en/decode between UTF-16 and the internal
representation?  AFAIR, we want to deprecate CCL, so using it in new
code should be avoided.
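
For illustration, a minimal sketch of the UTF-16 route, assuming that
BOM-less big-endian UTF-16 (the `utf-16be' coding system) is acceptable
to the caller.  The function name `my-utf16-roundtrip' is hypothetical
and exists only for this example.  Note that the quoted CCL program
does not emit raw UTF-16: it writes each surrogate as a three-byte
UTF-8-style sequence and encodes 0x0 as #xC0 #x80, so this sketch shows
the suggested alternative, not an equivalent of the patch.

```elisp
;; Sketch only: `my-utf16-roundtrip' is a hypothetical helper, not part
;; of Emacs or of the patch under review.
(defun my-utf16-roundtrip (string)
  "Return (BYTES . DECODED) for STRING.
BYTES is STRING encoded as big-endian UTF-16 without a BOM, and
DECODED is the result of decoding BYTES back into a Lisp string."
  (let* ((bytes (encode-coding-string string 'utf-16be))
         (decoded (decode-coding-string bytes 'utf-16be)))
    (cons bytes decoded)))

;; U+1F600 lies outside the BMP, so the encoder emits the surrogate
;; pair D83D DE00 for it; the leading ?a becomes 0061.
(my-utf16-roundtrip (string ?a #x1F600))
```

The car of the result is a unibyte string holding the UTF-16 code
units, and the cdr should compare `equal' to the input string.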