Re: GUILE 2/3 and string encoding cost
From: Urs Liska
Subject: Re: GUILE 2/3 and string encoding cost
Date: Wed, 22 Jan 2020 21:32:18 +0100
User-agent: Evolution 3.34.1-2+b1
On Wednesday, 22.01.2020, at 20:28 +0000, Carl Sorensen wrote:
>
> On 1/22/20, 1:21 PM, "lilypond-devel on behalf of David Kastrup" <
> lilypond-devel-bounces+c_sorensen=address@hidden on behalf of
> address@hidden> wrote:
>
> Han-Wen Nienhuys <address@hidden> writes:
>
> > On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <address@hidden> wrote:
> >
> >> Han-Wen Nienhuys <address@hidden> writes:
> >>
> >> > I looked a bit through the GUILE source code to see what is going on.
> >> >
> >> > I believe our current hypothesis (LilyPond's slowdown is caused by
> >> > expensive unicode transcoding into 32-bit strings) is incorrect.
> >> >
> >> > If you look into the source code, you can see that the UTF-8 -> SCM
> >> > conversion checks if there are any code points over 255
> >> >
> >> > https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
> >> >
> >> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
> >> > string as a normal byte array. This code walks the string twice, but that
> >> > is very cheap due to CPU cache locality, so it should be
> >> > essentially equivalent to whatever GUILE 1.8 was doing.
> >>
> >> GUILE 1.8 did not walk the string even once
> >>
> >
> > GUILE 1.8 walks it once when you do memcpy.
>
> Ok, but that's sort of a cheap walk.
>
> >> > Even so, if the input file does use UTF-8, there should be little
> >> > overhead, because the number of texts that we process is always
> >> > small. LilyPond is not a text processor.
> >> >
> >> > So, what hard data do we have on GUILE 2/3 slowness, and what does
> >> > that data say?
> >>
> >> That data says "humongous slowdown". There is not much more than
> >> speculation what this is caused by as far as I know.
> >>
> >>
> > Do we have a standardized test file for benchmarking performance?
>
> input/regression/mozart-hrn-3.ly possibly, but it's not particularly
> large.
>
> We don't have a standardized test file, but we do have some
> representative results from a couple of (unknown but described)
> files:
>
> https://lists.gnu.org/archive/html/lilypond-devel/2018-10/msg00054.html
>
> Perhaps we could get those files to become standards (along with some
> other, shorter-compiling files).
>
Not right now, but in the not-so-distant future I'd be able¹ to provide
the 650 examples from the Mozart violin school as a set of many small
scores, which might be a nice complement to one large score.
Urs
¹ It's not about copyright (the edition is released under a CC license) but
about the material being ready for that purpose.
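
If a mixed set like this (many small scores plus one large score) did become a benchmark, a timing harness could look roughly like the sketch below. It is only an illustration: `time_command`, `benchmark` and `make_argv` are hypothetical names, the file names are placeholders, and the command that is timed here is a stand-in Python process, not LilyPond.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times; return (best, median) wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,  # fail loudly if the command itself fails
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.median(samples)


def benchmark(files, make_argv, repeats=3):
    """Time one command per input file; make_argv maps a path to an argv list."""
    return {path: time_command(make_argv(path), repeats) for path in files}


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. For a real run one would
    # pass something like: lambda path: ["lilypond", path]
    demo = benchmark(
        ["small-001.ly", "large.ly"],  # placeholder names, never opened here
        lambda path: [sys.executable, "-c", "pass"],
        repeats=2,
    )
    for path, (best, median) in demo.items():
        print(f"{path}: best {best:.3f}s, median {median:.3f}s")
```

Reporting the best and the median of several runs keeps a single noisy run from dominating the comparison between GUILE versions.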
> Carl
>
>
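
For reference, the narrow/wide decision that Han-Wen describes in the quoted text can be sketched in a few lines of Python. This follows only the description in the quote (one scan for code points over 255, then a second pass to store the string). It is not GUILE's actual C code, and `store_string` is a hypothetical name.

```python
def store_string(utf8_bytes):
    """Return ("narrow", latin1_bytes) or ("wide", utf32_bytes).

    Illustrative model of the behaviour described in the thread, not
    GUILE's implementation.
    """
    text = utf8_bytes.decode("utf-8")

    # Pass 1: look for any code point above 255.
    narrow = all(ord(ch) <= 0xFF for ch in text)

    if narrow:
        # Pass 2: one byte per character (Latin-1).
        return "narrow", text.encode("latin-1")

    # Pass 2: four bytes per character (32-bit code points).
    return "wide", text.encode("utf-32-le")


if __name__ == "__main__":
    for sample in ("Allegro", "Pr\u00e4ludium", "Dvo\u0159\u00e1k"):
        kind, payload = store_string(sample.encode("utf-8"))
        print(sample, kind, len(payload))
```

In this model, pure ASCII and Latin-1 input stays at one byte per character, and only text with a code point above 255 (such as the "ř" in the last sample) takes the 32-bit path.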