emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: master 669aeaf: Prefer make_nil_vector to make-vector with nil


From: Pip Cet
Subject: Re: master 669aeaf: Prefer make_nil_vector to make-vector with nil
Date: Sat, 15 Aug 2020 19:53:32 +0000

On Sat, Aug 15, 2020 at 6:48 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 8/12/20 6:05 AM, Pip Cet wrote:
>
> > bugs like the one we currently have in fns.c (Fdelete allocates an
> > uninitialized vector, then calls Fequal, which can call quit, leaving
> > a half-initialized vector on the stack and potentially marked by
> > conservative GC) are bound to bite us one day
>
> Good catch; I hadn't noticed this GC-related bug. I looked through the Emacs
> source code for similar instances of the bug, and fixed the ones I found with
> the first attached patch.

I'm not sure all of those are strictly necessary, but I'm all in favor. Thanks.

> > Particularly for
> > small-ish fixed-size vectors, it seems to me uninitialized vectors are
> > more trouble than they're worth (in fact, I could see make_vector (n,
> > Qnil) being faster on many CPUs, because the cache lines are written
> > to completely and don't have to be brought in from RAM.
>
> I don't quite follow. If Fmake_vector (n, Qnil) invokes memset to clear the
> memory, then it should work the same way as (make_uninit_vector followed by
> immediate initialization) as far as the hardware cache is concerned.

It depends on whether "immediate" is immediate enough for the
hardware. IIRC, the last CPU whose documentation I looked at closely
allowed only register operations between the movq insns, so

v = make_uninit_vector (n);
for (i = 0; i < n; i++)
  ASET (v, i, AREF (w, i));

would be enough to force the cache line containing v->contents to be
read, whereas

v = make_nil_vector (n);
for (i = 0; i < n; i++)
  ASET (v, i, AREF (w, i));

would combine the initializing zero writes to that cache line, notice
the previous cache line contents had become irrelevant, and never
actually fault in the cache line.

Needless to say, with code like

for (i = 0; i < n; i++)
  ASET (v, i, Fcomplicated_function (AREF (w, i)));

no CPU will be able to decide that the cache line containing
v->contents will be written to completely.

IOW, memset is okay both on the CPU I dimly recall and modern actual
CPUs, copying data from another vector might or might not be okay
depending on the CPU, anything more complicated than that is likely
not okay on any current CPU.

> And if
> Fmake_vector (n, Qnil) doesn't invoke memset but instead relies on calloc 
> (which
> in turn just mmaps /dev/zero or whatever), then Fmake_vector should be slower
> than uninitialized vectors due to the cache effects that you mention.

I've not said anything about the case of pre-cleared memory; memset
can be faster than overwriting memory one word at a time, because
cache lines' previous contents become irrelevant in a way that's
obvious enough for the CPU to catch.

> (If this
> is a significant performance issue with Fmake_vector then we should fix it, 
> but
> that issue is independent of this discussion.)

I agree. I just don't think it's at all obvious that
make_uninit_vector is worth it even if it's used perfectly, and that
erroneous usage tips the scales in favor of dropping it.

> If we changed hash_table_contents to use Fcopy_sequence, wouldn't
> hash_table_rehash become a bit slower? hash_table_rehash would need to look at
> the entire sequence instead of just its first first h->count elements.

Yeah, but the slight slowdown would be worth it to keep the code
simpler, IMHO. OTOH, I'm working on the more ambitious hash table
rewrite and it would remove the key_and_value vector entirely, so I'd
rather focus on that and go back to optimizing the current hash table
implementation if the rewrite proposal is shot down.

> That being said, you're right that the Emacs code was using make_uninit_vector
> too often, even aside from the GC bugs noted above. Inspired by your comment, 
> I
> went through the Emacs source and replaced all calls to make_uninit_vector 
> (and
> allocate_vector) that were easy to replace and didn't seem to make any
> performance difference, by installing the second attached patch.

Thank you!

> This patch isn't as aggressive as your comment suggested, though, as it 
> doesn't
> replace code like this from charset.c:
>
>    val = make_uninit_vector (8);
>    for (i = 0; i < 8; i++)
>      ASET (val, i, make_fixnum (code_space[i]));
>
> Here, changing make_uninit_vector to make_nil_vector initializes 
> unnecessarily,
> there's no chance of GC before the vector is initialized, and readability is 
> not
> significantly improved by changing the code to something like the following:
>
>    val = CALLN (Fvector,
>                make_fixnum (code_space[0]), make_fixnum (code_space[1]),
>                make_fixnum (code_space[2]), make_fixnum (code_space[3]),
>                make_fixnum (code_space[4]), make_fixnum (code_space[5]),
>                make_fixnum (code_space[6]), make_fixnum (code_space[7]));

I agree the hypothetical code is much less readable, and your patches
(modulo the fns.c problems) look fine to me. Thank you again!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]