
Re: Optimizing memory footprint of bidi code


From: Stefan Monnier
Subject: Re: Optimizing memory footprint of bidi code
Date: Mon, 07 Jun 2010 12:01:26 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

> I see a bidi_dir_t enumeration that has only 3 possible values.  Can
> we trust modern compilers to use the smallest integer type for
> enumerations?

Not that I know, no.

> If so, this part is already taken care of.  If not, how to make this
> type smaller without losing the benefits of an enumerated type (I
> don't want to manipulate magic values or macros)?

I'm not sure what you're worried about.  What's wrong with adding a ":2"
to the field?
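
Something along these lines (a minimal sketch; the names are illustrative
and not necessarily what bidi.h actually uses):

    typedef enum { NEUTRAL_DIR, L2R, R2L } bidi_dir_t;

    struct bidi_example_it
    {
      /* Still declared with the enumerated type, but only 2 bits wide.
         Bit-fields of enum type are implementation-defined in ISO C,
         but GCC accepts them, and the three values fit in 2 bits.  */
      bidi_dir_t paragraph_dir : 2;
      /* 62 embedding levels fit comfortably in 6 bits (0..63).  */
      unsigned int resolved_level : 6;
    };

You keep the type checking and the symbolic names, and the compiler does
the masking for you.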

> bits for the latter, at the time this code was designed, 9 years ago,
> there was a significant performance hit for manipulating bits on the
> then popular architectures.  Or at least that's what I thought.  These

Accessing and modifying a bitfield is, and has always been, slower than
doing the same with a plain "int", yes.  Some machines (typically the P4)
have relatively slow bit-shifting operations, among other things.

> Maybe manipulating bitfields is no longer an issue with current
> architectures and compiler technology.  Maybe this problem didn't

I think the issue is not so much whether bitfield access has gotten
faster, but that processors in general have gotten faster, while memory
has sped up much less.  So in many cases it's worthwhile to do a bit
more work if it lets you access less memory.  In many cases (tho
obviously not all), time can be estimated fairly well by looking at the
amount of memory that's accessed, treating the actual computation as
being completely free.
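
As a rough illustration of the space side of that trade-off (a
self-contained sketch, not code taken from bidi.c):

    #include <stdio.h>

    /* The same three logical fields, stored as plain ints vs. packed
       into bit-fields.  The packed layout costs a shift/mask on each
       access, but touches far less memory when many such records are
       scanned.  */
    struct plain  { int type; int level; int dir; };
    struct packed { unsigned int type : 5;
                    unsigned int level : 6;
                    unsigned int dir : 2; };

    int
    main (void)
    {
      printf ("plain:  %zu bytes\n", sizeof (struct plain));   /* typically 12 */
      printf ("packed: %zu bytes\n", sizeof (struct packed));  /* typically 4 */
      return 0;
    }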

> Can someone "in the know" tell what is the current state of art in
> these matters?

I'm not sure I'm "in the know" enough, but the general rule is that if
those fields amount to a significant quantity of memory (as compared to
the "working set" within the various levels of caches) and if those
fields are often accessed at about the same time (at the time-scale of
the corresponding level of cache), then packing them into bitfields can be
a win.
If they're not accessed at the same time, then you don't save any memory
(you still bring in a whole cache line to access either the int or the
bitfield).
Also if you almost constantly access the same fields (i.e. you'd almost
want them in registers), then bitfields are probably not a good choice.

>   . struct bidi_it weighs in at 712 bytes on a 64-bit machine.  Can
>     this amount of memory cause more frequent cache misses, with
>     today's cache sizes?

If there's only one such struct used, then it's probably small enough
that it doesn't matter.

Typical L1 caches are in the 16KB range ("have been and always will be",
it seems: apparently anything smaller would make them useless, and
anything larger would make them slow enough that it's worth adding a
smaller, lower-level cache in front), so 712 bytes are not lost in the
noise at that scale, but they do fit comfortably.
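(Assuming 64-byte cache lines, which is typical: 712 bytes span 12 lines,
i.e. somewhere around 4-5% of a 16KB L1.)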

>   . Out of 712 bytes, the level stack takes 512.  The Unicode Standard
>     requires support for 62 levels (I will remove the extra 2 when I
>     have time), but I have never seen more than 4 used in any
>     real-life text.  I don't want to give up full compliance with
>     UAX#9, and I don't want to allocate memory from the heap for each
>     extra level.  But maybe starting with a 4-level stack and
>     allocating the other 58 when the 5th is required will be a good
>     enough optimization, leaving us with a 250-byte structure for all
>     but the very extreme use-cases?

That's not worth the trouble: as long as these extra bytes are contiguous
and are not accessed, they only cost "main memory" and don't slow us down
(they won't be brought into the cache at all).
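To make that concrete: the numbers you quote imply about 8 bytes per
level, so 4 levels occupy only the first 32 or so bytes of the 512-byte
stack; the cache only ever loads the line(s) covering those bytes, and
the remaining ~480 bytes never leave main memory.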

>   . OTOH, the code on bidi.c consults a much larger char-table of
>     bidirectional properties _for_each_character_it_processes_.  Given
>     this, how significant is the impact of a few hundred bytes in
>     struct bidi_it for causing a possible increase in cache misses?

For a single bidi_it, with fields accessed "all the time", you're
probably better off with int than with bitfields (bitfields will slow
you down every time you access them, and they'll mostly stay in cache if
they're used often enough, so their cost is only in terms of pushing
other useful things out of the cache).

Of course, only measurements can give reliable answers.
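(E.g. something along the lines of "perf stat -e
cache-references,cache-misses" around a bidi-heavy redisplay, or your
favorite profiler, should show whether the difference is measurable at
all.)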


        Stefan


