coreutils

Re: SV: Wish: checksumming *sum filter


From: Assaf Gordon
Subject: Re: SV: Wish: checksumming *sum filter
Date: Fri, 10 May 2019 08:36:50 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

Hello,

Ole Tange wrote:

I have a couple of times wanted a checksumming filter that works like cat if the input matches the checksum and outputs nothing if it does not:

  Get_untrusted_input | sha256sum bf794518e35d7f1ce3a50b3058c4191bb9401e568fc645d77e10b0f404cf1f22 |
    do something on the stream I now know matches the sum

Could we extend the *sum programs to support that?

As an interesting side note, OpenBSD's "signify" program
can work in a similar way, so there is some real-world use case
for this scenario. The man-page (https://man.openbsd.org/signify) gives
the following example:

   Verify a gzip pipeline:
       $ ftp url | signify -Vz -t arc | tar ztf -

(of course, using it is slightly more complicated and more secure than
comparing a checksum string).


On 2019-05-10 7:13 a.m., Ole Tange wrote:
Pádraig wrote:

The fact that *sum would need to consume/buffer all the input would mean that the parallelism from the rest of the pipe is lost (well I suppose the process startup overhead is parallelized).

If Ole's use-case is similar to signify, then it's not so much about parallelism of the pipe, but more about saving keystrokes / multiple commands.


Just like sort/wc and *sum if used in a pipe today:

     cat file | sha256sum | ...

I.E. it's functionally equivalent to:

  Get_untrusted_input > /tmp/blah
  sha256sum -c <(echo "$chksum  /tmp/blah") &&
  ...
  rm /tmp/blah

Yes, except you can do it on a read-only file system and in a shell that does not support <().


As Pádraig wrote, the data needs to be stored "somewhere" while the
checksumming is happening and before the data is sent down the pipe
to the next program.

If the input is a regular file (not pipe from another program's STDOUT),
then it can be rewound.

If the input is not a file, then its content needs to be stored
elsewhere - and if this is a restricted system (read only filesystem) -
you'll need to store it in memory (/dev/shm or tmpfs could help).

I'm attaching an example script that does the above.
It works on Alpine Linux (using 'ash' as the shell and busybox as
grep/printf/mktemp/sha256sum). Not well tested, use at your own risk.
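
The attachment is not reproduced in this archive. As a rough illustration of
the buffering approach described above (store the input, checksum it, and only
then pass it on), here is a minimal sketch. It is not the attached script: the
function name, the /dev/shm fallback logic and the exit codes are assumptions
made for illustration, and it is equally untested beyond trivial input.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached sha256filter.sh.
# Usage: ... | sha256filter EXPECTED_SHA256 | ...
# Copies stdin to stdout only if its SHA-256 matches EXPECTED_SHA256;
# otherwise writes nothing to stdout and returns non-zero.
sha256filter() {
    expected=$1

    # Assumption: /dev/shm, when present and writable, is memory-backed and
    # usable even if the rest of the file system is read-only.
    if [ -d /dev/shm ] && [ -w /dev/shm ]; then
        tmp=$(mktemp /dev/shm/sha256filter.XXXXXX) || return 2
    else
        tmp=$(mktemp) || return 2
    fi

    # Buffer the whole input first: nothing may reach stdout before the check.
    cat > "$tmp" || { rm -f "$tmp"; return 2; }

    # sha256sum prints "<hash>  -" when reading stdin; keep only the hash.
    actual=$(sha256sum < "$tmp")
    actual=${actual%% *}

    if [ "$actual" = "$expected" ]; then
        cat "$tmp"
        status=0
    else
        echo "sha256filter.sh: error: input does not match expected checksum" >&2
        status=1
    fi

    rm -f "$tmp"
    return $status
}
```

With the function defined, a pipeline such as
`Get_untrusted_input | sha256filter "$chksum" | next_command` passes data on
only after the whole input has been verified. Note that next_command still
starts and sees an empty input on a mismatch, so the pipeline's exit status
has to be checked separately if that matters.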

Usage example:
    $ echo "hello world" > 1.txt
    $ sha256sum 1.txt
    8cd07f3a5ff98f2a78cfc366c13fb123eb8d29c1ca37c79df190425d5b9e424d  1.txt

  Test with incorrect checksum:
    $ cat 1.txt | ./sha256filter.sh 8cd07f3a5ff98f2a78cfc366c13fb123eb8d29c1ca37c79df190425d99999999 | wc -l
    sha256filter.sh: error: input does not match expected checksum
    0

  With correct checksum:
    $ cat 1.txt | ./sha256filter.sh 8cd07f3a5ff98f2a78cfc366c13fb123eb8d29c1ca37c79df190425d5b9e424d | wc -l
    1


regards,
 - assaf


Attachment: sha256filter.sh
Description: application/shellscript

