bug-coreutils

bug#13135: Loss of data while copying


From: Pádraig Brady
Subject: bug#13135: Loss of data while copying
Date: Sat, 15 Dec 2012 03:15:29 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 12/10/2012 08:43 PM, John Reiser wrote:
On 12/10/2012 10:03 AM, Eric Blake wrote:
On 12/10/2012 10:06 AM, John Reiser wrote:
Yes, because a count was specified,
dd will operate in its default awkward but POSIX specified mode
of counting each read() call, even if it returned less than specified.
This is especially noticeable with pipes:

So this bug report is really about the execrable documentation for 'dd'.
Despite similar complaints appearing yearly [or so],
the text of "info dd" does not contain the string "pipe".  SHAME ON COREUTILS.
Explaining the most common error, and how to avoid it, certainly does
belong in the documentation.  The purpose of documentation is to *FACILITATE*
the correct use of the tool, and not merely to erect the minimal legal defense
of the code.

We've tried really hard to make this issue obvious,
even going to the effort of automatically prompting the user
to use iflag=fullblock.
The full discussion of the awkward auto suggestion logic
can be seen in http://bugs.gnu.org/7362

In more "normal" cases users will get the warning:

$ yes blah | src/dd of=/dev/null bs=100001 count=10000
dd: warning: partial read (53248 bytes); suggest iflag=fullblock
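
As a rough illustration of what that warning is about, the sketch below
compares the byte totals with and without iflag=fullblock. It assumes GNU dd
and a pipe buffer smaller than the 100001-byte block size (the Linux default
is 64 KiB), so every read() from the pipe comes back short. The variable
names and the smaller count=10 are only for illustration.

```shell
# Assumption: GNU dd, and a pipe buffer smaller than bs (Linux default: 64 KiB),
# so each read() on the pipe returns fewer than 100001 bytes.

# Default mode: count=10 means ten read() calls, however short each one is.
default_bytes=$(yes blah | dd bs=100001 count=10 2>/dev/null | wc -c | tr -d ' ')

# With fullblock, dd keeps calling read() until each 100001-byte block is full.
full_bytes=$(yes blah | dd bs=100001 count=10 iflag=fullblock 2>/dev/null | wc -c | tr -d ' ')

echo "default:   $default_bytes bytes"
echo "fullblock: $full_bytes bytes"
```

Under those assumptions the fullblock run copies exactly 10 * 100001 bytes,
while the default run copies fewer, since each counted read is a short one.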

We didn't prompt in the reported case because it's
a bit of an edge case in that ibs is specified
rather than bs. Since writes are aggregated
in that case, and to support use cases like the following,
we don't warn here:

$ (echo part1; sleep 1; echo part2; sleep 1; echo discard) |
  dd count=2 ibs=4096 obs=1 2>/dev/null
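
A small sketch of why that use case depends on counting reads, assuming GNU dd
and a writer slow enough that each line arrives as its own short read. The
function name slow_writer and the variable names are made up for the example.

```shell
# Three lines written one second apart, so each should arrive as a separate short read.
slow_writer() { echo part1; sleep 1; echo part2; sleep 1; echo discard; }

# Default mode: count=2 counts two read() calls, so only the first two lines are copied.
by_reads=$(slow_writer | dd count=2 ibs=4096 obs=1 2>/dev/null)

# With fullblock, dd keeps reading until a 4096-byte block is full or EOF is hit,
# so all three lines end up in the first (partial) block.
by_blocks=$(slow_writer | dd count=2 ibs=4096 obs=1 iflag=fullblock 2>/dev/null)

printf 'default:\n%s\n\nfullblock:\n%s\n' "$by_reads" "$by_blocks"
```

So iflag=fullblock is not a drop-in default: here it would pull in the line
that the default mode deliberately leaves behind.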

Rather than complaining, how about you submit a patch to improve the
documentation?


diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 21400ad..c2282eb 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8055,6 +8055,7 @@ OS/360 JCL.
  @item if=@var{file}
  @opindex if
  Read from @var{file} instead of standard input.
+(If the input is a pipe then see @samp{fullblock} below.)

I think I'll move the warning to count=
as it's mostly an issue when that is specified.


  @item of=@var{file}
  @opindex of
@@ -8397,6 +8398,9 @@ may return early if a full block is not available.
  When that happens, continue calling @code{read} to fill the remainder
  of the block.
  This flag can be used only with @code{iflag}.
+If the input is a pipe and argument @samp{count=} also is specified,
+then probably @samp{iflag=fullblock} should be used
+in order to prevent surprises caused by short reads.

How about this instead?

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 5f8fad7..b916a86 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8117,6 +8117,11 @@ Copy @var{n} @samp{ibs}-byte blocks from the input file,
 of everything until the end of the file.
 if @samp{iflag=count_bytes} is specified, @var{n} is interpreted
 as a byte count rather than a block count.
+Note that if the input may return short reads, as could be the case
+when reading from a pipe for example, @samp{iflag=fullblock}
+will ensure that @samp{count=} corresponds to complete input blocks
+rather than the traditional POSIX-specified behavior of counting
+input read operations.

 @item status=@var{which}
 @opindex status
@@ -8397,6 +8402,10 @@ may return early if a full block is not available.
 When that happens, continue calling @code{read} to fill the remainder
 of the block.
 This flag can be used only with @code{iflag}.
+This flag is useful with pipes for example
+as they may return short reads. In that case,
+this flag is needed to ensure that a @samp{count=} argument is
+interpreted as a block count rather than a count of read operations.

 @item count_bytes
 @opindex count_bytes
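
For the byte-count case mentioned in the count= text above, a minimal sketch
that combines both flags when reading from a pipe. It assumes a GNU dd new
enough to support iflag=count_bytes; the variable name n is illustrative.

```shell
# Assumption: GNU dd with support for iflag=count_bytes.
# count=10000 is a byte total here; fullblock makes dd keep reading from the
# pipe until each block, including the final partial one, is complete.
n=$(yes | dd bs=4096 count=10000 iflag=count_bytes,fullblock 2>/dev/null | wc -c | tr -d ' ')
echo "$n"
```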




