Re: String handling in xwidget primitives
From: Eli Zaretskii
Subject: Re: String handling in xwidget primitives
Date: Fri, 29 Jan 2016 22:25:15 +0200

> From: address@hidden
> Date: Fri, 29 Jan 2016 20:25:21 +0100
>
> I briefly tested this:
>
> (xwidget-webkit-execute-script (xwidget-at 0) "alert('𝌆')")
>
> where 𝌆 is some kind of Unicode char I stole from
>
> https://mathiasbynens.be/notes/javascript-encoding
>
> This page seems to indicate UTF-16 is used.

I've seen such claims. But they cannot be true, since if they were,
we couldn't have passed pure ASCII strings to those interfaces without
triggering weird errors: each ASCII character takes 2 bytes in UTF-16,
not one.
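
To make the size argument concrete, here is a small sketch in Python
(used purely for illustration; it does not touch Emacs or WebKit, and
the variable names are made up for this example). It encodes an ASCII
script both ways and, for comparison, the non-BMP character from the
test above:

```python
# Illustration only: byte layout of the same script text in UTF-8 vs UTF-16.
ascii_script = "alert('hi')"

utf8 = ascii_script.encode("utf-8")
utf16 = ascii_script.encode("utf-16-le")  # little-endian, no BOM

# Every ASCII character is 1 byte in UTF-8 but 2 bytes in UTF-16.
print(len(ascii_script), len(utf8), len(utf16))

# The UTF-16 form has a zero byte after each ASCII character, so an
# interface that expects a NUL-terminated byte string would not read
# past the first character.
print(b"\x00" in utf8, b"\x00" in utf16)

# U+1D306 lies outside the BMP: 4 bytes in UTF-8, and a surrogate
# pair (also 4 bytes) in UTF-16.
non_bmp = "\U0001D306"
print(non_bmp.encode("utf-8").hex(), non_bmp.encode("utf-16-be").hex())
```

So a pure-ASCII string handed over as plain bytes is byte-for-byte
valid UTF-8, but it is not valid UTF-16 for the same text.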

I think UTF-16 is used internally to represent strings, but the script
itself should not be in UTF-16. I think it should either be in UTF-8
(in which case it requires a BOM), or it should include the charset=
metadata to indicate its encoding.

> I executed the code in a buffer containing a webkit instance, and the
> char showed up in an alert box originating from the webkit instance.
>
> This doesn't actually prove anything, but it does seem to show that in my
> case on my machine and environment, at least something goes right.

Sheer luck: you just didn't bump into any of the subtleties that make
the internal representation of strings in Emacs a superset of UTF-8
rather than exactly UTF-8.

> If we do need to encode, do you know some part of the Emacs source where
> I can see which functions to use?

It depends on how we need to encode. In general,
code_convert_string_norecord is the most frequently used function in
these cases.