[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: efficient implementation of string-replace-substring / string-replac
From: |
Mark H Weaver |
Subject: |
Re: efficient implementation of string-replace-substring / string-replace-all |
Date: |
Fri, 13 Sep 2013 15:41:16 -0400 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) |
Hi Arne,
Arne Babenhauserheide <address@hidden> writes:
> For wisp I created an efficient implementation of substring replacement and
> thought it might be useful for guile in general.
>
> I optimized it a bit to get rid of (likely) quadratic behaviour:
>
>
> ; ,time (string-replace-substring (xsubstring "abcdefghijkl" 0 99999) "def"
> "abc")
> ; 1.140127s real time, 1.139714s run time. 0.958733s spent in GC.
> ; 0.885618s real time, 0.885350s run time. 0.742805s spent in GC.
> ; second number after multiple runs
Here, you're including (xsubstring "abcdefghijkl" 0 99999) in the
benchmark. Better to (define big (xsubstring "abcdefghijkl" 0 99999))
first, and then: ,time (string-replace-substring big "def" "abc")
> (define (string-replace-substring s substring replacement)
> "Replace every instance of substring in s by replacement."
> (let ((sublen (string-length substring)))
> (let replacer
> ((newstring s)
> (index (string-contains s substring)))
> (if (not (equal? index #f))
> (let ((replaced (string-replace newstring replacement index
> (+ index sublen))))
> (replacer replaced (string-contains replaced substring
> index)))
> newstring))))
Here's an implementation that does this benchmark about 80 times faster
on my machine: (20 milliseconds vs 1.69 seconds)
--8<---------------cut here---------------start------------->8---
(define* (string-replace-substring s substr replacement
#:optional
(start 0)
(end (string-length s)))
(let ((substr-length (string-length substr)))
(if (zero? substr-length)
(error "string-replace-substring: empty substr")
(let loop ((start start)
(pieces (list (substring s 0 start))))
(let ((idx (string-contains s substr start end)))
(if idx
(loop (+ idx substr-length)
(cons* replacement
(substring s start idx)
pieces))
(string-concatenate-reverse (cons (substring s start)
pieces))))))))
--8<---------------cut here---------------end--------------->8---
The reason this is so much faster is because it avoids needless
generation of intermediate strings.
Regards,
Mark