bug-coreutils

bug#47883: sort -o loses data when it crashes


From: Peter van Dijk
Subject: bug#47883: sort -o loses data when it crashes
Date: Sun, 18 Apr 2021 19:46:01 +0200
User-agent: Cyrus-JMAP/3.5.0-alpha0-273-g8500d2492d-fm-20210323.002-g8500d249

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html: -o  output
    Specify the name of an output file to be used instead of the standard 
output. This file can be the same as one of the input files.

https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html: 
"data may be lost if the system crashes or sort encounters an I/O or other 
serious error while a file is being sorted in place" and "sort with --merge 
(-m) can open the output file before reading all input"

While the manual (but not the manpage) mentions the data loss, I think it would 
be great if sort did not have this problem at all, and I think the OpenGroup 
text also implies it should not. I looked around, and a lot of software gets 
this right by opening a randomly-named temporary file for writing, and only 
moving it into place once it has been written successfully - GNU sed -i and 
OpenBSD sort do, and surely there are more. As a bonus, doing this would also 
make the `-o someinputfile -m` case safe.

Reproduction of the data loss is easy:

$ seq 10000 > 10000 ; prlimit --fsize=10 sort -R -o 10000 10000 ; wc -l 10000
File size limit exceeded (core dumped)
2 10000


(coreutils shuf has the same problem, even though not all of the code appears 
to be shared - for example, sort opens the file for writing even before it 
opens it for reading, while shuf performs those two operations in the opposite 
order. That difference does not change the outcome, though.)

-- 
  Peter van Dijk
  peter@7bits.nl




