bug-coreutils

bug#10877: Wimpy external files.


From: Paul Eggert
Subject: bug#10877: Wimpy external files.
Date: Sat, 25 Feb 2012 10:41:05 -0800
User-agent: Mozilla/5.0 (X11; Linux i686; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 02/25/2012 04:56 AM, Rogier Wolff wrote:

> there is a logic error in the code that determines the
> maximum memory to use: You said it was supposed to use 1/8th of total
> memory. However it then takes another factor of two "margin".

Thanks for catching that.  I installed a fix (patch at end
of this message).
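
To make the ordering issue concrete, here is a minimal sketch of the two clamping orders. It is not the actual sort.c code: the names sketch_size_old, sketch_size_new, and SKETCH_MIN_SORT_SIZE are hypothetical, the minimum value is a placeholder, and a single hard_limit parameter stands in for the several getrlimit checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for sort.c's MIN_SORT_SIZE; the real value differs.  */
#define SKETCH_MIN_SORT_SIZE ((size_t) 1 << 16)

/* Old order: clamp to the memory advice MEM first, then halve.
   The advice is halved along with the hard limit, so a MEM of
   physmem/8 yields physmem/16.  */
static size_t
sketch_size_old (double mem, size_t hard_limit)
{
  size_t size = hard_limit;
  if (mem < size)
    size = mem;
  size /= 2;
  return size < SKETCH_MIN_SORT_SIZE ? SKETCH_MIN_SORT_SIZE : size;
}

/* New order: halve only the hard limit, then clamp to MEM.
   A MEM of physmem/8 is used as is when the limit allows it.  */
static size_t
sketch_size_new (double mem, size_t hard_limit)
{
  size_t size = hard_limit / 2;
  if (mem < size)
    size = mem;
  return size < SKETCH_MIN_SORT_SIZE ? SKETCH_MIN_SORT_SIZE : size;
}
```

With 8 GiB of total memory and no effective resource limit, the old order gives 512 MiB and the new order gives 1 GiB. When a hard limit is the binding constraint, both orders give the same result.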

> I don't think that any guessing should be done if we cannot determine
> the filesize. In that case we have great heuristics to come up with a
> reasonable buffer size without the filesize.

A problem with that idea: suppose we have many
independent 'sort' invocations running at the same time,
say as part of a shell pipeline.  If they each grab 1/8 of
physical RAM merely to sort a few bytes of piped data,
they may exhaust swap space.
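
As a rough illustration of the arithmetic, the sketch below totals the buffer space requested if every concurrent 'sort' asks for physmem/8. The helper name sketch_aggregate_request is hypothetical, and the figure is an upper bound on what is requested, not a measurement of what the kernel actually commits.

```c
/* Hypothetical helper: total buffer space requested when N_SORTS
   concurrent 'sort' processes each ask for PHYSMEM / 8.  */
static double
sketch_aggregate_request (double physmem, unsigned int n_sorts)
{
  return n_sorts * (physmem / 8);
}
```

Under this assumption, 8 concurrent sorts request all of physical RAM between them, and 16 request twice that.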

Perhaps we can improve the heuristics for pipes, but I hope
you can see why I'm a bit leery of a heuristic that says
"if the input is from a pipe, pretend it's from a file of
infinite size".

From 28197ef851af8f7e4f5f98f4433090cbbd63fbac Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Sat, 25 Feb 2012 10:32:52 -0800
Subject: [PATCH] sort: default to physmem/8, not physmem/16

* src/sort.c (default_sort_size): Don't divide advice by 2.
Just divide the hard limits by 2.  This matches the comments.
Reported by Rogier Wolff in http://bugs.gnu.org/10877
---
 src/sort.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 6875a6a..60ff415 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1414,13 +1414,9 @@ default_sort_size (void)
   struct rlimit rlimit;
 
   /* Let SIZE be MEM, but no more than the maximum object size or
-     system resource limits.  Avoid the MIN macro here, as it is not
-     quite right when only one argument is floating point.  Don't
-     bother to check for values like RLIM_INFINITY since in practice
-     they are not much less than SIZE_MAX.  */
+     system resource limits.  Don't bother to check for values like
+     RLIM_INFINITY since in practice they are not much less than SIZE_MAX.  */
   size_t size = SIZE_MAX;
-  if (mem < size)
-    size = mem;
   if (getrlimit (RLIMIT_DATA, &rlimit) == 0 && rlimit.rlim_cur < size)
     size = rlimit.rlim_cur;
 #ifdef RLIMIT_AS
@@ -1439,7 +1435,11 @@ default_sort_size (void)
     size = rlimit.rlim_cur / 16 * 15;
 #endif
 
-  /* Use no less than the minimum.  */
+  /* Return the minimum of MEM and SIZE, but no less than
+     MIN_SORT_SIZE.  Avoid the MIN macro here, as it is not quite
+     right when only one argument is floating point.  */
+  if (mem < size)
+    size = mem;
   return MAX (size, MIN_SORT_SIZE);
 }
 
-- 
1.7.6.5






