bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

xalloc.h (x2nrealloc): Don't always double the buffer size.


From: Jim Meyering
Subject: xalloc.h (x2nrealloc): Don't always double the buffer size.
Date: Fri, 02 Feb 2007 00:52:55 +0100

For background, this thread started here:

  http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/9581

Andi Kleen <address@hidden> wrote:
>> [*] Can you even get a 2GB-long string into expr?
>
> At least on Linux you can't, the kernel limits arguments and environments
> to 128K or so. There are some patches floating around to enlarge that
> limit, but I don't think it will be ever 2GB because it would be
> too risky to pin that much memory unswappable.
>
> I think the others can only get regexprs from command line options too.
> It would only make sense if you have a option to input the regexpr
> from a file, which neither expr nor nl nor tac have.
>
> So I guess at least on Linux this doesn't make sense.

The length of the string being searched matters.  More precisely,
it's the offsets (into the buffer being searched) of the beginning
and end of the match.

I wanted to prove to myself that I could exploit the bug
when using the libc regexp, so cooked up this example with nl:

First, show how nl's "-b pREGEX" option works:
(number only lines matching REGEXP)

  $ seq 5|sed s/5/:/|nl -b p:
         1
         2
         3
         4
       1  :

Create a sparse file, exactly 2GB long:

  $ dd if=/dev/null of=2gb bs=1M seek=2k count=0

choose a long marker string so the search will go faster:

  $ re=$(printf %01000d 0)

append the marker to the file (the start-of-marker will have offset 2^31),
which would overflow a 32-bit signed type:

  $ echo $re >> 2gb

Run nl, so that it numbers the sole (2^31+-byte-long) line
matching the regexp,

  $ nl -b p$re 2gb > k

That failed on a 64-bit system with 2GB ram because linebuffer.c
tried to allocate a buffer longer than 4GB (it may have been 8GB),
and there wasn't enough virtual memory.

linebuffer.c uses xalloc.h's x2realloc function, which was
doubling the buffer size whenever necessary.  Obviously, when the
buffer is very large, doubling is wasteful.  Here, it was fatal.

I've just changed xalloc's x2nrealloc to do n = 3n/2, rather than n *= 2,
and with that change, the above invocation of "nl" completed:
[BTW, even with --without-included-regex, the above nl command worked
 fine.  That's because nl doesn't use the would-have-overflowed offsets.
 However, it looks like ptx *does*.  So, users of ptx with very long
 lines, beware! :-) ]

        Give tools a better chance to allocate space for very large buffers.
        * lib/xalloc.h (x2nrealloc): Use 3/2, not 2, as buffer size factor.

Index: lib/xalloc.h
===================================================================
RCS file: /sources/gnulib/gnulib/lib/xalloc.h,v
retrieving revision 1.32
diff -u -p -r1.32 xalloc.h
--- lib/xalloc.h        8 Nov 2006 00:22:30 -0000       1.32
+++ lib/xalloc.h        1 Feb 2007 23:32:57 -0000
@@ -1,7 +1,7 @@
 /* xalloc.h -- malloc with out-of-memory checking

    Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-   1999, 2000, 2003, 2004, 2006 Free Software Foundation, Inc.
+   1999, 2000, 2003, 2004, 2006, 2007 Free Software Foundation, Inc.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -139,10 +139,10 @@ xnrealloc (void *p, size_t n, size_t s)
    allocating an initial block with a nonzero size, or by allocating a
    larger block.

-   In the following implementation, nonzero sizes are doubled so that
-   repeated reallocations have O(N log N) overall cost rather than
-   O(N**2) cost, but the specification for this function does not
-   guarantee that sizes are doubled.
+   In the following implementation, nonzero sizes are increased by a
+   factor of approximately 1.5 so that repeated reallocations have
+   O(N log N) overall cost rather than O(N**2) cost, but the
+   specification for this function does not guarantee that rate.

    Here is an example of use:

@@ -204,9 +204,9 @@ x2nrealloc (void *p, size_t *pn, size_t
     }
   else
     {
-      if (((size_t) -1) / 2 / s < n)
+      if ((2 * (((size_t) -1 - 1) / 3)) / s < n)
        xalloc_die ();
-      n *= 2;
+      n = n + n / 2 + 1;
     }

   *pn = n;




reply via email to

[Prev in Thread] Current Thread [Next in Thread]