Re: how to calculate the size of string in bytes?
From: tomas
Subject: Re: how to calculate the size of string in bytes?
Date: Tue, 18 Aug 2015 12:13:52 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Aug 18, 2015 at 02:11:54AM -0700, Sam Halliday wrote:
> Hi all,
>
> We've had to change the ENSIME protocol to be more friendly to other editors
> and this has meant changing how we frame TCP messages.
>
> We used to have a 6 character hex number at the start of each message that
> counted the number of multibyte characters, but we'd like to change it to be
> the number of bytes in the message.
>
> We're sending the string to `process-send-string' and `read'ing from the
> associated network buffer. But when calculating the outgoing length of the
> string that we want to send, we use `length' --- but we need this to be
> `length-in-bytes' not the number of multibyte chars. Is there a built in
> function to do this or am I going to have to iterate the string and count the
> byte size of each character?
>
> A quick test shows that
>
> (length (encode-coding-string "EURO" 'raw-text))
>
> seems to give the correct result (1 for ASCII, 2 for Pound Sterling, 3 for
> Euro), but I am not 100% sure if this is correct.
`raw-text' is, afaik, Emacs's internal coding system. You don't want traces
of it on the network :-)
I'd expect you to use whichever coding system the network protocol prescribes
(these days it'd be UTF-8 by default). Things will (mostly) work for raw-text
since it's nearly UTF-8.
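
To make the difference between character count and byte count concrete, here is a minimal sketch that names the coding system explicitly instead of relying on raw-text. The helper name `my-utf8-byte-length' is made up for illustration and is not part of Emacs or ENSIME; it also assumes the protocol wants UTF-8 with plain LF line endings.

```elisp
;; Hypothetical helper: the name is an assumption, not an Emacs or
;; ENSIME function.
(defun my-utf8-byte-length (string)
  "Return the number of bytes STRING occupies when encoded as UTF-8."
  ;; `encode-coding-string' returns a unibyte string, so `length'
  ;; counts bytes here.  `utf-8-unix' keeps newlines as a single LF.
  (length (encode-coding-string string 'utf-8-unix)))

(length "\u20ac")               ; => 1 (one character)
(my-utf8-byte-length "a")       ; => 1
(my-utf8-byte-length "\u00a3")  ; => 2 (pound sign)
(my-utf8-byte-length "\u20ac")  ; => 3 (euro sign)
```
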
The really correct way to do this (AFAICS) would be to find out which encoding
process-send-string is going to use (via process-coding-system) and use *that*
in the length calculation -- this way you won't lie :-)
So I'd try this (slightly reordering the let*):
  (let* ((msg (concat (ensime-prin1-to-string sexp) "\n"))
         ;; `process-coding-system' returns (DECODING . ENCODING);
         ;; the cdr is what `process-send-string' encodes with.
         (coding-system (cdr (process-coding-system proc)))
         (string (concat (ensime-net-encode-length
                          (length (encode-coding-string msg coding-system)))
                         msg)))
    ...
It seems somewhat wasteful to encode msg (to find its length) just
to let process-send-string encode again -- perhaps there's a better
idiom around for that. The use case seems common enough. Anyone?
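
One possible way around the double encoding, offered as a sketch and not as a confirmed idiom: encode the message once, measure the result, and hand the already-encoded unibyte string to `process-send-string'. The names `my-encode-length' and `my-frame-message' are hypothetical, and the "%06x" header is an assumption based on the 6 character hex prefix described in the quoted question.

```elisp
;; Hypothetical helpers: names and the header format are assumptions,
;; not ENSIME's actual implementation.
(defun my-encode-length (n)
  "Return N as a 6-digit, zero-padded hexadecimal string."
  (format "%06x" n))

(defun my-frame-message (msg coding-system)
  "Return MSG encoded with CODING-SYSTEM, prefixed by its byte length.
The result is a unibyte string: an ASCII header followed by the payload."
  (let ((payload (encode-coding-string msg coding-system)))
    ;; PAYLOAD is unibyte, so `length' is its size in bytes.
    (concat (my-encode-length (length payload)) payload)))

;; Possible usage, with PROC being the network process:
;; (process-send-string proc (my-frame-message msg 'utf-8-unix))
```

As far as I understand, `process-send-string' does no character-code conversion on a unibyte string, although end-of-line conversion may still apply depending on the process coding system. That is worth checking against the manual before relying on it.
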
regards
-- tomás