bug#9321: repeated segfaults sorting large files in 8.12

From: Jim Meyering
Subject: bug#9321: repeated segfaults sorting large files in 8.12
Date: Sat, 20 Aug 2011 08:31:46 +0200

Andras Salamon wrote:

> I am seeing repeated (but not reliably repeatable) segmentation faults
> sorting datasets in the 100MB-100GB range on a 64-bit Debian system
> using GNU sort 8.12 (and also 8.9).  Stack traces seem to indicate
> problems during the merge phase, usually when the temporary files
> are being combined.
> This may or may not be related to the recent discussion about
> #9307, but I am definitely using 8.12, rebuilt with CFLAGS=-g since
> several indicative values were otherwise optimised out, configured
> with --disable-nls --disable-threads, and am running with a fixed
> buffer -S 100M and also --parallel=1 to try to isolate problems from
> possible threading issues.  I was seeing these crashes with a vanilla
> build also.
> At least one crash occurred when comparing the very last entry in
> the memory buffer to a non-existent entry, when merging large files.
> There was also a crash with total_lines=851122 in mergelines_node,
> which leads to node->hi containing what appears to be garbage, with
> length=2882303761517117516.
> The repository changelog seems to indicate that the current development
> release of sort has not changed since 8.12.  Will attempting to track
> the problem down with 8.12 be useful?

Yes, most definitely.
As Pádraig already mentioned, most useful would be instructions
showing how to reproduce the failure, even if part of that is something
like "run this command 30 times" to provoke the rare failure.
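
For instance, a "run this command 30 times" recipe could be sketched as a small Python driver like the one below. This is only an illustration, not something from the original report: the helper name run_repeatedly and the input file big-input.txt are hypothetical, and the sort options are simply the ones quoted above. It relies on the POSIX convention that subprocess reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess


def run_repeatedly(cmd, attempts=30):
    """Run cmd up to `attempts` times.

    Returns (attempt_number, signal_name) for the first run that was
    killed by a signal (e.g. SIGSEGV), or None if no run was.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means "killed by signal N".
        if proc.returncode < 0:
            return attempt, signal.Signals(-proc.returncode).name
    return None


if __name__ == "__main__":
    # Hypothetical input file; the options mirror those in the report.
    cmd = ["sort", "-S", "100M", "--parallel=1", "-o", "/dev/null",
           "big-input.txt"]
    result = run_repeatedly(cmd)
    if result is None:
        print("no crash observed")
    else:
        print("run %d was killed by %s" % result)
```

Reporting the attempt number alongside the signal gives a rough idea of how rare the failure is, which helps whoever tries to reproduce it.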

> If so I can post stack traces
> and values of relevant variables from the core dump, or post a new
> issue in the tracker, or reopen #9307.  If not, please suggest some
> specific actions I should take to generate useful information.

Thanks for the detailed report and investigation.
Have you reproduced the problem on more than one system?
If not, have you recently run any tests of your system's hardware?
It would be a shame to invest a lot of debugging effort
only to find that it is a hardware problem on one specific system.
