bug-coreutils

bug#34923: Message/race bug in 'dd'


From: Daniel A. Gauthier
Subject: bug#34923: Message/race bug in 'dd'
Date: Tue, 19 Mar 2019 20:41:44 -0400

Coreutils / DD / stderr output warning / Timing or count error:

I may or may not have reported this bug many years (decade?) ago and just
noticed it's still there.

Correct example.

dd
<data>
^D
0+1 records in
0+1 records out
followed by the usual timing/byte counts, kB/s, etc.; in this case 7 bytes
(the "<data>" is literal). The +1 is due to the short block.

This works fine as long as you aren't using "conv=sync,noerror".

If you are, however, then run:

dd conv=sync,noerror if=file-with-bad-spots.dat of=newfile-without-,-hopefully.dat
(sometime after...)
0+0 records in
0+0 records out
dd: read error on file-with-bad-spots.dat
Usual other timing/byte counts, etc. are correct (AFAIK)
(-- a few minutes go by ... --)
0+1 records in
0+0 records out
dd: read error (again, yadayadayada)
(many more minutes...)
0+2 records in
0+0 records out
Timing, counts, etc.
Total bytes per the file copied, etc. all appear to be correct, and the
"+nn" is the number of bad spots.

NOTICE that the "+nn" value on the line is always one off.  It says +0
after the first error, +1 after the second, etc. until the correct count
of error/short blocks is given at the end.  My utility program monitors
the output of dd's stderr and prints a colored warning message whenever
the "+nn" count goes up, but it doesn't go up right away.  Even
worse, if you are using USR1 to signal on a regular schedule, as I do,
the error is indicated to the program long after the error actually
occurred and the included byte count does NOT point to the bad spot. 
(The subsequent USR1 signals indicate the current error count, but the
one that was printed WITH the error message does not)
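
For illustration, here is a minimal sketch of the kind of monitor described above. It is not the actual utility; the function name partial_increases and the sample list are hypothetical, and the sample lines are modeled on the status output quoted earlier in this report. It only assumes that dd's status lines have the form "N+M records in".

```python
import re

# Matches dd status lines such as "0+1 records in" (full+partial blocks).
RECORDS_IN = re.compile(r"^(\d+)\+(\d+) records in$")


def partial_increases(lines):
    """Yield (line_index, old, new) each time the "+nn" count rises."""
    last = 0  # dd starts with zero partial records
    for i, line in enumerate(lines):
        m = RECORDS_IN.match(line.strip())
        if not m:
            continue  # skip "records out", error messages, timing lines
        partial = int(m.group(2))
        if partial > last:
            yield (i, last, partial)
        last = partial


# Hypothetical capture of dd's stderr, modeled on the report above.
sample = [
    "0+0 records in",
    "0+0 records out",
    "dd: read error on file-with-bad-spots.dat",
    "0+1 records in",
    "0+0 records out",
    "dd: read error on file-with-bad-spots.dat",
    "0+2 records in",
    "0+0 records out",
]

for index, old, new in partial_increases(sample):
    print(f"line {index}: partial count went from +{old} to +{new}")
```

With this sample, the first warning fires on the status block printed with the second read error, not the first, which is the one-report lag described above.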

I hope there's no historical reason for this; I might be able to test
other versions to see if it's there too.  From a design perspective, I
see a number of reasons why the text printed to stderr at the time the
error is encountered should be consistent with, and reflective of, the
error that caused it to be printed, and I see no reason why the phase of
the number stream versus the input stream should matter to anything
else.  Combine that with the fact that it appears to be an easy change,
and I'm hoping someone familiar with the code could kick out a patch in
3 or 4 minutes that would take me much longer, having no familiarity
with the code.  (But if I ever get free time I might take a look.)

The obvious assumption is that a "++err_count" line needs to be moved
up 1 or 2 lines.
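
As a toy model of that guess (not dd's actual code; report_on_errors and count_first are hypothetical names), the sketch below shows how the order of "increment the counter" and "print the status" alone would produce the observed off-by-one. Whether dd's source is really structured this way is an assumption.

```python
def report_on_errors(blocks, count_first):
    """Return the "+nn" value shown with each read error.

    blocks: list of bools, True meaning the read of that block failed.
    count_first: if True, bump the counter before reporting;
                 if False, report first and bump afterwards.
    """
    partial = 0
    shown = []
    for bad in blocks:
        if not bad:
            continue
        if count_first:
            partial += 1
            shown.append(partial)  # report includes the current error
        else:
            shown.append(partial)  # report lags by one error
            partial += 1
    return shown


blocks = [False, True, False, True]  # two bad spots
print(report_on_errors(blocks, count_first=False))  # lagging order
print(report_on_errors(blocks, count_first=True))   # order suggested above
```

In this model the report-then-count order yields +0 and +1 for the two errors, matching the output quoted earlier, while the count-then-report order yields +1 and +2.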

I've had various accounts on tracking systems over the years, but my
attention span is really short.

Daniel A. Gauthier
aka Fractal







