Re: [PATCH] Significant performance improvements to locate, struct order

bug-findutils

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Significant performance improvements to locate, struct order

From:	Eric Blake
Subject:	Re: [PATCH] Significant performance improvements to locate, struct order optimizations
Date:	Wed, 16 May 2018 09:36:30 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 05/16/2018 02:35 AM, Bernhard Voelker wrote:

    1. By default, I replaced the strstr implementation with the
       fast_strstr implementation by Raphael Javaux (BSD 3-Clause,
       https://github.com/RaphaelJ/fast_strstr).


Which states:

"Its worst case complexity (O(n × m) where n is the length of the stringand m the length of the searched sub-string) is the same as the naivebrute-force algorithm but it mostly runs with a linear complexity (O(n))on most strings."

However, gnulib's strstr() implementation has a worst-case complexity ofO(n+m) (linear, not quadratic). Thus, fast_strstr() is NOT alwaysfaster, given pathological input.

There may be ways to speed up gnulib's strstr(). gnulib was the firstwidely-available linear implementation of strstr() that I'm aware of (atthe time I contributed it, I could not find any other open sourceimplementation that was linear) - but that does not mean it is thefastest. At one point, gnulib's version was imported to glibc (aslinear is better than quadratic for large string searches, although thestartup penalties are noticeable on small strings); later, glibc addedheuristics to further speed up their implementation to dynamicallyswitch between algorithms to avoid the penalties on small strings whilestill remaining linear on large strings. I'm not sure if all of thesubsequent glibc improvements have been ported back to gnulib. gnulibalready rejects libc strstr() that are quadratic (which I suspect isstill the case with MacOS) - but that becomes a topic for gnulib todecide if we should also be replacing libc strstr() when we can provethat our implementation is faster, without forcing the replacement (andrisking speed penalties) on other systems.


First of all, performance improvements should *never* be done in
one patch,


Make that: s/performance/multiple performance/

But yes, a good rule of thumb is that you should always have each patchdo one thing only, rather than cramming multiple actions into a singlepatch. You want to make the reviewer's life easier, even if a series ofwell-divided patches adds up to more cumulative changes (due to churn onsome lines) than a single patch that combines all the changes at once.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH] Significant performance improvements to locate, struct order optimizations, Devin Hussey, 2018/05/12
- Re: [PATCH] Significant performance improvements to locate, struct order optimizations, Bernhard Voelker, 2018/05/16
  - Re: [PATCH] Significant performance improvements to locate, struct order optimizations, Eric Blake <=

Prev by Date: Re: [PATCH] Significant performance improvements to locate, struct order optimizations
Previous by thread: Re: [PATCH] Significant performance improvements to locate, struct order optimizations
Index(es):
- Date
- Thread