According to Matthew Woehlke on 5/29/2009 4:59 PM:
>> Add to DESCRIPTION
>> Implementations shall behave as if they read the memory byte by byte
>> from the beginning of the bytes pointed to by s and stop at the first
>> occurrence of c.
> Doesn't that preclude *any* sort of optimization? Or is it always safe
> to read up to the end of a word boundary?
The as-if rule is very powerful. For example, this wording explicitly
permits the x86_64 implementation that Ulrich checked into glibc on May 21
(commit fa64b7f), where the assembly code speculatively preloads a cache
line at a time, reading many bytes beyond the memory that actually belongs
to the pointer. A failed speculative load across a page boundary is still
safe, as long as a match is later found in the bytes before that boundary.
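To make "as if" concrete, here is a minimal sketch of the abstract model
that the proposed wording describes. The name model_memchr is hypothetical
and this is not code from glibc or gnulib; it only illustrates the
observable behavior that any optimized implementation has to match.

```c
#include <stddef.h>

/* Hypothetical reference model: read one byte at a time from s and
   stop at the first occurrence of c, or after n bytes.  */
void *
model_memchr (const void *s, int c, size_t n)
{
  const unsigned char *p = s;

  for (; n > 0; p++, n--)
    if (*p == (unsigned char) c)
      return (void *) p;   /* no byte past the match is ever read */
  return NULL;
}
```

Under this model a call such as model_memchr ("abc", 'b', (size_t) -1)
touches only the first two bytes, so an oversized length is harmless
whenever a match exists. An optimized version may load more than that
internally, provided the caller cannot tell the difference.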
And if others agree with me that we need to provide a gnulib memchr
replacement for installations that are using a glibc version that predates
last week (and thus causes problems with higher level algorithms such as
strstr), the replacement will be C code that scans an aligned word at a
time, similar to how it is already done in memchr2.c. Not quite as
efficient as hand-tuned assembly, but hands down faster than a byte at a time.
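For illustration, a rough sketch of what an aligned word-at-a-time scan
might look like. This is not the actual gnulib replacement and not a copy
of memchr2.c; the name word_memchr is hypothetical, and the sketch assumes
that unsigned long has no padding bits and that a pointer converted to
uintptr_t reflects the address alignment. Note that when n is larger than
the object, the word loop reads up to the end of the aligned word that
contains the match. Strict ISO C does not bless that, but an aligned word
never straddles a page boundary, which is the property the as-if wording
relies on.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of a word-at-a-time memchr.  */
void *
word_memchr (const void *s, int c_in, size_t n)
{
  typedef unsigned long word;
  const unsigned char *p = s;
  unsigned char c = (unsigned char) c_in;

  /* Scan byte by byte until p is aligned on a word boundary.  */
  for (; n > 0 && (uintptr_t) p % sizeof (word) != 0; p++, n--)
    if (*p == c)
      return (void *) p;

  /* Masks with 0x01 and 0x80 in every byte, and c in every byte.  */
  word ones = ULONG_MAX / UCHAR_MAX;
  word highs = ones << (CHAR_BIT - 1);
  word repeated_c = ones * c;

  /* Scan one aligned word at a time.  After the XOR, a byte of w is
     zero exactly where the input byte equals c, and the expression
     below is nonzero iff w contains at least one zero byte.  */
  while (n >= sizeof (word))
    {
      word w;
      memcpy (&w, p, sizeof w);   /* aligned load without aliasing issues */
      w ^= repeated_c;
      if (((w - ones) & ~w & highs) != 0)
        break;
      p += sizeof (word);
      n -= sizeof (word);
    }

  /* Pin down the matching byte in the current word, or scan the tail.  */
  for (; n > 0; p++, n--)
    if (*p == c)
      return (void *) p;
  return NULL;
}
```

With an exact n, the tail loop keeps every read inside the first n bytes.
The only over-read happens when n overstates the size and a match exists,
and then it is confined to the aligned word holding that match.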