bug#47883: sort -o loses data when it crashes
From: Peter van Dijk
Subject: bug#47883: sort -o loses data when it crashes
Date: Sun, 18 Apr 2021 19:46:01 +0200
User-agent: Cyrus-JMAP/3.5.0-alpha0-273-g8500d2492d-fm-20210323.002-g8500d249
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html:
-o output
Specify the name of an output file to be used instead of the standard
output. This file can be the same as one of the input files.
https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html:
"data may be lost if the system crashes or sort encounters an I/O or other
serious error while a file is being sorted in place" and "sort with --merge
(-m) can open the output file before reading all input"
While the manual (but not the manpage) mentions the data loss, I think it would
be great if sort did not have this problem at all, and I read the Open Group
text as saying it should not have this problem. I looked around, and a lot of
software does get this right (by opening a randomly-named temp file to write
to, and only moving it into place once it has been written successfully) - GNU
sed -i, OpenBSD sort, and surely there are more. As a bonus, doing this would
also make the `-o someinputfile -m` case safe.
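To illustrate the temp-file-then-rename approach, here is a minimal sketch as a shell wrapper around sort, not a patch to sort itself. The function name `safe_sort_in_place` and the `.sort.XXXXXX` template are made up for this example, and it assumes a `mktemp` that accepts a template containing a directory path.

```shell
# Hypothetical helper, not part of coreutils: sort FILE in place, but leave
# the original untouched unless sort finishes successfully.
# Usage: safe_sort_in_place FILE [sort options...]
safe_sort_in_place() {
    file=$1
    shift
    # Create the temp file next to the target so the final mv is a rename
    # within one filesystem rather than a copy across filesystems.
    tmp=$(mktemp "$(dirname -- "$file")/.sort.XXXXXX") || return 1
    if sort "$@" -- "$file" > "$tmp"; then
        # Only now replace the original with the complete output.
        mv -f -- "$tmp" "$file"
    else
        status=$?
        # sort failed or was killed: drop the partial output, keep the input.
        rm -f -- "$tmp"
        return "$status"
    fi
}
```

It would be called as `safe_sort_in_place 10000 -R`. This sketch has known gaps: the replaced file ends up with the permissions mktemp gave the temp file instead of those of the original, and nothing is fsynced, so it covers a sort that fails or is killed but makes no promise about a system crash.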
Reproduction of the data loss is easy:
$ seq 10000 > 10000 ; prlimit --fsize=10 sort -R -o 10000 10000 ; wc -l 10000
File size limit exceeded (core dumped)
2 10000
(coreutils shuf has the same problem even though not all code appears to be
shared - for example, sort opens the file for writing even before it opens it
for reading, while shuf reverses the order of those two operations. The
ordering does not change the outcome, though.)
--
Peter van Dijk
peter@7bits.nl