Re: [PATCH] Use unaligned memory accesses unconditionally
From: Fangrui Song
Subject: Re: [PATCH] Use unaligned memory accesses unconditionally
Date: Tue, 8 Aug 2023 21:44:16 -0400
On Tue, Aug 8, 2023 at 4:12 AM Jean Delvare <jdelvare@suse.de> wrote:
>
> Hi Fangrui,
>
> On Wed, 2 Aug 2023 17:18:46 +0000, Fangrui Song via dmidecode-devel wrote:
> > Currently ALIGNMENT_WORKAROUND is only defined for __ia64__ and __arm__.
> > However, -fsanitize=alignment (part of UndefinedBehaviorSanitizer) will
> > give errors for other architectures like x86. Modern compilers are able
> > to optimize the memory access, so let's just use unaligned memory
> > accesses unconditionally.
>
> I'm not sure what qualifies as a "modern compiler" for you, but on my
> up-to-date openSUSE Leap 15.4 system, gcc 7.5.0 x86_64 builds very
> different, much larger, and 0.8% slower code with
> -DALIGNMENT_WORKAROUND:
>
> add/remove: 0/0 grow/shrink: 9/0 up/down: 5099/0 (5099)
> Function                       old     new   delta
> dmi_table_decode             24440   27760   +3320
> dmi_decode_oem                8475    9806   +1331
> dmi_print_cpuid                434     626    +192
> smbios3_decode                 369     481    +112
> smbios_decode                  569     625     +56
> legacy_decode                  300     348     +48
> dmi_processor_family           536     560     +24
> dmi_processor_frequency         86      94      +8
> dmi_get_cpuid_type.isra        379     387      +8
> Total: Before=60630, After=65729, chg +8.41%
>
> Therefore I'm not willing to apply your patch.
>
> --
> Jean Delvare
> SUSE L3 Support
Sorry for assuming that the compilers were smart enough.

I confirm that with either GCC or Clang, WORD built with
ALIGNMENT_WORKAROUND compiles to a sequence of add instructions
rather than a single load.

On the other hand, I have confirmed that unaligned memory access via
memcpy compiles to optimal code for every combination of
{x86_64, aarch64} and {clang, gcc}, and I have sent V2.
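
For readers following the thread, here is a minimal sketch of what a
memcpy-based accessor can look like. It is an illustration only, not the
V2 patch itself: the names load_u16, load_u32 and load_u16_bytes are
hypothetical, and the byte-wise variant is included just for comparison
with the ALIGNMENT_WORKAROUND style of access.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): read a 16-bit value in host
 * byte order from a pointer that may be unaligned. A fixed-size memcpy
 * is well defined for any alignment, so -fsanitize=alignment has nothing
 * to report, and compilers typically lower it to a single load on
 * targets that allow unaligned accesses.
 */
static inline uint16_t load_u16(const uint8_t *p)
{
    uint16_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Same idea for a 32-bit value (also a hypothetical name). */
static inline uint32_t load_u32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * Byte-wise little-endian read, shown for comparison. It is also
 * alignment-safe, but it is the form that was observed to produce
 * separate add instructions.
 */
static inline uint16_t load_u16_bytes(const uint8_t *p)
{
    return (uint16_t)(p[0] + (p[1] << 8));
}
```

On a little-endian host, load_u16() and load_u16_bytes() return the same
value for the same bytes. The memcpy variants read in host byte order, so
a big-endian host would need an additional byte swap to decode
little-endian SMBIOS fields.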
--
宋方睿