Re: Converting a part of byte vector to UTF-8 string
From: Nala Ginrut
Subject: Re: Converting a part of byte vector to UTF-8 string
Date: Wed, 15 Jan 2014 12:59:16 +0800
hi there!
On Tue, 2014-01-14 at 00:17 +0100, Panicz Maciej Godek wrote:
> Another option would be to use
> (substring (utf8->string buffer) 0 n)
>
> This one works, but according to the manual, the
> string is "newly allocated", so it's unnecessary overhead.
>
Actually, substring is COW (copy-on-write), so you don't have to
worry. You may also try substring/shared, which won't copy the
characters at all.
But please be careful about the side effects in your context ;-)
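To make that side effect concrete, here is a minimal sketch. It assumes a
mutable string (hence string-copy, since string literals are read-only in
Guile); the variable names are only for illustration.

```scheme
;; Start from a mutable string; literals are read-only in Guile.
(define s (string-copy "hello world"))

(define a (substring s 0 5))         ; copy-on-write: detached on mutation
(define b (substring/shared s 0 5))  ; keeps sharing storage with s

;; Mutate the original string.
(string-set! s 0 #\J)

a ; "hello" (unaffected by the change to s)
b ; "Jello" (the change to s shows through)
```

So substring/shared saves the copy, but any later mutation of the original
string leaks into the shared substring, and vice versa.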
> What would be the best solution?
>
IMO, no matter whether you use substring or substring/shared in this
context, you have to allocate a new string. The reason is that we don't
have something like bytevector/shared.
But IIRC a bytevector in Guile is similar to a C array, which means you
can avoid any allocation when you slice a bytevector, provided you
handle the array pointer properly.
So one may take advantage of that.
But I can't say you can avoid allocation when you convert a bytevector
to a string, because both utf8->string and pointer->string allocate
anyway.
(Anyone, please correct me if I'm wrong!)
Here's my black magic:
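-------------------------------cut------------------------------
(use-modules (system foreign)    ; bytevector->pointer, pointer->string
             (rnrs bytevectors)) ; bytevector-length, string->utf8

;; Decode part of BV as a string without slicing the bytevector first.
;; START and END count characters; SIZE is the number of bytes per
;; character, so this only works when every character has the same width.
(define* (bv->string/partly bv #:optional (start 0)
                                          (end #f)
                                          (size 1)
                                          (encoding "utf-8"))
  (let ((len (if end
                 (* size (- end start))
                 (- (bytevector-length bv) (* size start))))
        ;; Address of the first byte to decode.
        (addr (+ (pointer-address (bytevector->pointer bv))
                 (* size start))))
    (pointer->string (make-pointer addr) len encoding)))
-------------------------------end--------------------------------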
(define bv (string->utf8 "我了个去啊"))
;; NOTE: each of these Chinese characters takes 3 bytes in UTF-8, so size==3
(bv->string/partly bv 2 4 3)
==> "个去"
;; Latin (ASCII) characters take 1 byte each, so the default size==1 works
(define bv2 (string->utf8 "hello world"))
(bv->string/partly bv2 0 5)
==> "hello"
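To see why the two calls need different size values, here is a small check
of how many bytes a character takes in UTF-8. It is only a sketch, and the
helper name utf8-size is made up for illustration.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: number of bytes STR occupies when encoded as UTF-8.
(define (utf8-size str)
  (bytevector-length (string->utf8 str)))

(utf8-size "个")   ; 3
(utf8-size "h")    ; 1
(utf8-size "h个")  ; 4, so no single size value fits mixed text
```

In other words, a fixed size only works when every character in the range
has the same encoded width.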
But I have to give a warning again: when you try to avoid allocation
overhead, you have to face the risk of side effects. Personally, I'd
prefer the pure-functional way. ;-P
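For comparison, a pure-functional version might look like the sketch
below. It assumes you accept one extra allocation for the slice; the name
bv-slice->string is hypothetical, and start/end are byte offsets here, not
character counts.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: decode bytes [START, END) of BV as UTF-8 by
;; copying them into a fresh bytevector first. No raw pointers involved.
(define* (bv-slice->string bv #:optional (start 0) (end #f))
  (let* ((end   (or end (bytevector-length bv)))
         (len   (- end start))
         (slice (make-bytevector len)))
    ;; R6RS argument order: source, source-start, target, target-start, count.
    (bytevector-copy! bv start slice 0 len)
    (utf8->string slice)))

(bv-slice->string (string->utf8 "hello world") 0 5)  ; "hello"
(bv-slice->string (string->utf8 "我了个去啊") 6 12)   ; "个去"
```

This costs a temporary bytevector on top of the string, but it never
touches memory through a raw address.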
> TIA
> M