[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#22357: grep -f not only huge memory usage, but also huge time cost
From: |
Norihiro Tanaka |
Subject: |
bug#22357: grep -f not only huge memory usage, but also huge time cost |
Date: |
Tue, 20 Dec 2016 07:38:34 +0900 |
On Sun, 18 Dec 2016 23:48:10 -0800
Paul Eggert <address@hidden> wrote:
> >> 'delete' is
> >> O(N); 'replace' calls 'delete' in a loop and is therefore O(N**2).
> >> 'epsclosure' calls 'replace' in a loop and so I suppose it is O(N**3).
> >> I haven't looked into how likely the worst-case performance is, though.
>
> > Yes. It is same both before and after with my proposed patch, but It
> > seems that memset() for VISITED causes slowdown in before code.
>
> I looked into it some more. Your patch looks like a win so I
> installed it; thanks. Also, I improved the worst-case performance for
> 'replace' from O(N*(N + log N)) to O(N log N), by installing the
> attached patch. I also bumped grep's gnulib version so that it gets
> this patch.
>
> This doesn't address the main problem in this area, as 'grep' is
> still pretty slow with lots of multiple patterns. However, one step
> at a time.
Thanks.
BTW, What case do you the worst? One more I think previous 'replace' is
not O(N*(N + log N)) but O(N + N log N) i.e. O(N log N) .
> This doesn't address the main problem in this area, as 'grep' is still
> pretty slow with lots of multiple patterns. However, one step at a time.
Yes, I think slowness with lots of multiple patterns is caused by two
reasons mainly.
1. dfa is not optimize parsed patterns.
For example, dfa does not replace pattern 'a \|a \|a \| ... \|a \|b'
to 'a \|b'.
I tried it several times, have not succeeded it yet.
2. grep matcher compiles regex always to check syntax of a pattern,
even when it is not used in searching.
I also tried it, have not succeeded it yet.
- bug#22357: grep -f not only huge memory usage, but also huge time cost, (continued)
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/10
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Trevor Cordes, 2016/12/11
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/11
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Bruno Haible, 2016/12/11
- bug#22357: grep -f not only huge memory usage, but also huge time cost, arnold, 2016/12/11
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/11
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Bruno Haible, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/14
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/17
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/19
- bug#22357: grep -f not only huge memory usage, but also huge time cost,
Norihiro Tanaka <=
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/19
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/20
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Trevor Cordes, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Trevor Cordes, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/12
- bug#22357: grep -f not only huge memory usage, but also huge time cost, L.A. Walsh, 2016/12/15
- bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Jackson, 2016/12/16
bug#22239: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/21