bug-coreutils

bug#41657: md5sum: odd escaping for input filename \


From: Michael Coleman
Subject: bug#41657: md5sum: odd escaping for input filename \
Date: Tue, 2 Jun 2020 23:52:38 +0000

Hi Bob,

Thanks very much for your prompt reply.  Certainly, if this is documented 
behavior, it's not a bug.  I would have never thought to check the 
documentation as the behavior seems so strange.

If I understand correctly, the leading backslash in the first field is an 
indication that the second field is escaped.  (The first field never needs 
escapes, as far as I can see.)

I'm not sure I would have chosen this design, and it probably can't be changed 
now.  Still, I suspect that almost no real shell script deals with this 
escaping correctly.  Really, I'd be surprised if there were even one example.  
If that's the case, perhaps it could be changed without trouble after all.

In any case, thanks very much for your explanation.

Regards,
Mike



-----Original Message-----
From: Bob Proulx <bob@proulx.com> 
Sent: Monday, June 1, 2020 08:53 PM
To: Michael Coleman <mcolema5@uoregon.edu>
Cc: 41657@debbugs.gnu.org
Subject: Re: bug#41657: md5sum: odd escaping for input filename \

Hello Michael,

Michael Coleman wrote:
> $ true > \\
> $ md5sum \\
> \d41d8cd98f00b204e9800998ecf8427e  \\
> $ md5sum < \\
> d41d8cd98f00b204e9800998ecf8427e  -

Thank you for the extremely good example!  It's excellent.

> The checksum is not what I would expect, due to the leading
> backslash.  And in any case, the "\d" has no obvious interpretation.
> Really, I can't imagine ever escaping the checksum.

As it turns out this is documented behavior.  Here is what the manual says:

     For each FILE, ‘md5sum’ outputs by default, the MD5 checksum, a
  space, a flag indicating binary or text input mode, and the file name.
  Binary mode is indicated with ‘*’, text mode with ‘ ’ (space).  Binary
  mode is the default on systems where it’s significant, otherwise text
  mode is the default.  Without ‘--zero’, if FILE contains a backslash or
  newline, the line is started with a backslash, and each problematic
  character in the file name is escaped with a backslash, making the
  output unambiguous even in the presence of arbitrary file names.  If
  FILE is omitted or specified as ‘-’, standard input is read.

Specifically it is this sentence.

  Without ‘--zero’, if FILE contains a backslash or newline, the line
  is started with a backslash, and each problematic character in the
  file name is escaped with a backslash, making the output unambiguous
  even in the presence of arbitrary file names.

And so the program is behaving as documented.  Which I am sure you will
not be happy about, since this bug report is about it.

Someone will correct me if I'm wrong, but I think the reasoning is that
the output of md5sum is most useful when it can be checked with
md5sum -c, and therefore the file name problem needed to be handled.
I don't remember what originally triggered the change.  But if you
check the output with -c, you get this result with your test case.

  $ md5sum \\ | md5sum -c
  \: OK
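
For a script that reads the file name field itself instead of going
through -c, the rule in the manual sentence quoted above can be
sketched roughly as follows.  This is only an illustration, not code
from coreutils: the function name sum_line_filename is made up here,
it assumes the default output format (no --tag, no --zero), and it
undoes only the two escapes that sentence mentions (backslash and
newline).

```sh
# Hypothetical helper (not part of coreutils): print the file name
# field from one line of default md5sum/sha*sum output.
# Variables are global because POSIX sh has no "local".
sum_line_filename() {
    line=$1
    case $line in
        \\*) escaped=1; line=${line#?} ;;   # leading backslash: name is escaped
        *)   escaped=0 ;;
    esac
    name=${line#* }     # drop the checksum and the space after it
    name=${name#?}      # drop the mode flag (' ' or '*')
    if [ "$escaped" -eq 0 ]; then
        printf '%s\n' "$name"
        return
    fi
    out=
    nl='
'
    # Walk the name one character at a time, decoding \\ and \n.
    while [ -n "$name" ]; do
        rest=${name#?}
        c=${name%"$rest"}
        name=$rest
        if [ "$c" = "\\" ] && [ -n "$name" ]; then
            rest=${name#?}
            c=${name%"$rest"}
            name=$rest
            case $c in
                n)    c=$nl ;;       # \n stands for a newline
                "\\") ;;             # \\ stands for one backslash
                *)    c="\\$c" ;;    # anything else: keep it verbatim
            esac
        fi
        out=$out$c
    done
    printf '%s\n' "$out"
}
```

With the two lines from the original report, the first call should
print a single backslash and the second a dash:

  sum_line_filename '\d41d8cd98f00b204e9800998ecf8427e  \\'
  sum_line_filename 'd41d8cd98f00b204e9800998ecf8427e  -'
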

And note that this applies to the other *sum programs too.

  The commands sha224sum, sha256sum, sha384sum and sha512sum compute
  checksums of various lengths (respectively 224, 256, 384 and 512
  bits), collectively known as the SHA-2 hashes. The usage and options
  of these commands are precisely the same as for md5sum and
  sha1sum. See md5sum invocation.

> (Yes, my users are a clever people.)

  I am so clever that sometimes I don't understand a single word of
  what I am saying. -- Oscar Wilde

:-)

Bob
