Re: [bug #64253] Suggestion - Add support for libmagic and xattr
From: Bernhard Voelker
Subject: Re: [bug #64253] Suggestion - Add support for libmagic and xattr
Date: Wed, 31 May 2023 23:18:18 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.2
I am not commenting on -magic/-mime here; this is just to discuss the given
statements about what is possible today.
On 5/25/23 21:18, anonymous wrote:
> Currently - with find : We need xargs and sed and so have to worry about
> whitespace paths and filenames, we are also spawning several sub-commands.
> find -type f |
> xargs file |
> sed -n 's/:.*PE32 executable.*/p' |
> xargs my_command
With find(1), one does not have to "worry about whitespace". There are several
ways to stay on the safe side:
- executing per file (which may be inefficient):
$ find ... -exec $TOOL '{}' ';'
- bulk execution:
$ find ... -exec $TOOL '{}' +
- if $TOOL understands zero-separated input (e.g. grep):
$ find ... -print0 | $TOOL -z
- otherwise:
$ find ... -print0 | xargs -r0 $TOOL
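The variants above can be checked with a small throwaway script. This is only a sketch: the temporary directory, the three test file names, and the inline `sh -c 'echo "$#"'` stand-in for $TOOL are made up for illustration.

```shell
#!/bin/sh
# Sketch: create awkward file names and count how many arguments the tool receives.
tmp_ws=$(mktemp -d)
trap 'rm -rf "$tmp_ws"' EXIT
touch "$tmp_ws/plain" "$tmp_ws/with space" "$tmp_ws/with
newline"

# Bulk execution: each file name arrives as exactly one argument.
count_exec=$(find "$tmp_ws" -type f -exec sh -c 'echo "$#"' sh '{}' +)

# Zero-separated hand-over to xargs: same result.
count_xargs=$(find "$tmp_ws" -type f -print0 | xargs -r0 sh -c 'echo "$#"' sh)

# For comparison only: plain xargs splits names on blanks and newlines.
count_unsafe=$(find "$tmp_ws" -type f | xargs sh -c 'echo "$#"' sh)

echo "exec +: $count_exec  xargs -0: $count_xargs  plain xargs: $count_unsafe"
```

The first two counts should equal the number of files created, while the plain xargs count is higher because the names get split.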
Re. file(1): unfortunately, this tool, although it has a --files-from option,
does not allow zero-separated input. For the search case, it would also come in
handy if file(1) had a --filter=PATTERN option, and furthermore allowed printing
only the names of the files matching the pattern, for safe post-processing in
other tools.
Today, one could efficiently and safely use something like this to find files
where file(1) returns a magic string matching PATTERN:
$ find ... -exec file -00 '{}' + \
| sed -nz 'h;n; /PATTERN/{g;p}' \
| xargs -0 my_command
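To see what the sed script does without running file(1), its input can be simulated. The sketch below assumes that `file -00` emits, per entry, the file name, a NUL, the description, and another NUL; the two sample entries are invented, and `sed -z` requires GNU sed.

```shell
#!/bin/sh
# Simulated `file -00` output: NAME NUL DESCRIPTION NUL, repeated per file.
# h   - save the current record (the file name) in the hold space
# n   - read the next record (the description) into the pattern space
# /../{g;p} - on a match, fetch the saved name back and print it
matched=$(printf '%s\0%s\0' \
            'a b.c'    'C source, ASCII text' \
            'note.txt' 'ASCII text' \
          | sed -nz 'h;n;/^C source/{g;p}' \
          | tr '\0' '\n')

echo "$matched"
```

Only the name whose description matches is passed on, still NUL-terminated before the final tr, so it is safe to feed into `xargs -0`.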
Here's an example that filters on regular files smaller than 40000 bytes and
modified within the last day, then lets the "file ...|sed ..." pipe filter for
the wanted magic string "C source", and finally continues the search in a
subsequent find(1) command:
$ find -type f -size -40000c -mtime -1 -exec file -00 '{}' + \
| sed -nz 'h;n;/^C source/{g;p}' \
| find/find -files0-from - -ls
Obviously, the file(1) run is always by far the most expensive part, because it
has to read all the files, but at least it is spawned as few times as possible,
which reduces the number of times the magic file has to be loaded.
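The difference in the number of spawned processes can be made visible with a stand-in command. This is a rough sketch: the temporary directory, the three files, and the `echo run` placeholder (used instead of file(1)) are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: count how often the command is started with ';' versus '+'.
tmp_sp=$(mktemp -d)
trap 'rm -rf "$tmp_sp"' EXIT
touch "$tmp_sp/a" "$tmp_sp/b" "$tmp_sp/c"

# One process per file: prints "run" once for every file found.
per_file=$(find "$tmp_sp" -type f -exec sh -c 'echo run' sh '{}' ';' | wc -l)

# Bulk execution: all names fit into a single invocation here.
bulk=$(find "$tmp_sp" -type f -exec sh -c 'echo run' sh '{}' + | wc -l)

echo "per file: $per_file  bulk: $bulk"
```

With a real file(1) run, every extra invocation would also reload the magic database, which is what the `+` form avoids.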
Have a nice day,
Berny