dmidecode-devel

Re: [PATCH] Use unaligned memory accesses unconditionally


From: Fangrui Song
Subject: Re: [PATCH] Use unaligned memory accesses unconditionally
Date: Tue, 8 Aug 2023 21:44:16 -0400

On Tue, Aug 8, 2023 at 4:12 AM Jean Delvare <jdelvare@suse.de> wrote:
>
> Hi Fangrui,
>
> On Wed,  2 Aug 2023 17:18:46 +0000, Fangrui Song via dmidecode-devel wrote:
> > Currently ALIGNMENT_WORKAROUND is only defined for __ia64__ and __arm__.
> > However, -fsanitize=alignment (part of UndefinedBehaviorSanitizer) will
> > give errors for other architectures like x86. Modern compilers are able
> > to optimize the memory access, so let's just use unaligned memory
> > accesses unconditionally.
>
> I'm not sure what qualifies as a "modern compiler" for you, but on my
> up-to-date openSUSE Leap 15.4 system, gcc 7.5.0 x86_64 builds very
> different, much larger, and 0.8% slower code with
> -DALIGNMENT_WORKAROUND:
>
> add/remove: 0/0 grow/shrink: 9/0 up/down: 5099/0 (5099)
> Function                                     old     new   delta
> dmi_table_decode                           24440   27760   +3320
> dmi_decode_oem                              8475    9806   +1331
> dmi_print_cpuid                              434     626    +192
> smbios3_decode                               369     481    +112
> smbios_decode                                569     625     +56
> legacy_decode                                300     348     +48
> dmi_processor_family                         536     560     +24
> dmi_processor_frequency                       86      94      +8
> dmi_get_cpuid_type.isra                      379     387      +8
> Total: Before=60630, After=65729, chg +8.41%
>
> Therefore I'm not willing to apply your patch.
>
> --
> Jean Delvare
> SUSE L3 Support

Sorry for assuming that the compilers were smart enough.
I can confirm that with both GCC and Clang, WORD compiles to a
sequence of add instructions.

On the other hand, I have confirmed that unaligned memory access via
memcpy compiles to optimal code for {x86_64,aarch64}x{clang,gcc}, and
I have sent a V2.


-- 
宋方睿


