bug-findutils

Re: RFE: "-mtype" + -menc


From: Assaf Gordon
Subject: Re: RFE: "-mtype" + -menc
Date: Tue, 7 May 2019 16:49:41 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

Hello,

On 2019-05-06 5:20 p.m., L A Walsh wrote:
> [...]
> Once you've narrowed down things to the type of file, you might
> be able to select a content-specific tool for some specific field.
> [...]
> There are a couple of different levels of meta info, but I don't
> really see any thing on unix/linux to even tell file type

As mentioned elsewhere, the file(1) program is extremely capable
of deducing file types from their content.

It is easy to combine find(1)+file(1) to filter only specific files
(based on attributes AND content), then act upon them with xargs.

A simplified example:

    find -type f \
       | xargs -d'\n' -r file -Ni \
       | grep ': audio/mpeg' \
       | cut -d: -f1 \
       | xargs -n1 -d'\n' PROGRAM

The above will scan all regular files, run file(1) on them,
and execute PROGRAM on each file whose detected mimetype
is 'audio/mpeg'.

(see below for handling filenames with problematic characters).
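To see what the grep/cut stages do, and why the simplified version
trips on some filenames, they can be tried on canned input that mimics
the output of 'file -Ni'. This is only a sketch: filter_mime and
sample_output are made-up names, and the three sample lines are
invented for illustration.

```shell
# filter_mime (hypothetical helper): reads "NAME: TYPE; charset=..." lines,
# the format printed by 'file -Ni', and prints NAME for lines matching TYPE.
filter_mime() {
    grep -- ": $1" | cut -d: -f1
}

# sample_output (hypothetical): stands in for the output of file(1).
sample_output() {
    printf '%s\n' \
        './song.mp3: audio/mpeg; charset=binary' \
        './notes.txt: text/plain; charset=us-ascii' \
        './live:2019.mp3: audio/mpeg; charset=binary'
}

sample_output | filter_mime 'audio/mpeg'
```

The third name comes out as './live' instead of './live:2019.mp3',
because cut splits on the first colon. Filenames containing colons or
newlines need the NUL-terminated approach.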

----

Once you know these are mp3 files, you can use other programs
to extract meta-data and filter based on that:

    find -type f  \
      | xargs -d'\n' -r file -Ni \
      | grep ': audio/mpeg' \
      | cut -d: -f1 \
      | xargs -d'\n' exiftool -p '$FileName:::$Genre' \
      | awk -F::: 'tolower($2) == "bluegrass" { print $1}'

This produces a list of MP3 files whose 'Genre' meta-data tag is "bluegrass".

This can go on-and-on by piping the output to "xargs" again,
extracting more information and filtering further.
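The awk filtering stage can also be tried on its own. The sketch
below feeds it a few invented "FileName:::Genre" lines, standing in
for what the exiftool stage above would print; genre_is and the sample
names are assumptions made for illustration.

```shell
# genre_is (hypothetical helper): reads "NAME:::GENRE" lines and prints NAME
# when GENRE equals the wanted genre, compared case-insensitively.
genre_is() {
    awk -F::: -v want="$1" 'tolower($2) == tolower(want) { print $1 }'
}

printf '%s\n' \
    'banjo.mp3:::Bluegrass' \
    'synth.mp3:::Electronic' \
    'fiddle.mp3:::bluegrass' \
  | genre_is bluegrass
```

Both banjo.mp3 and fiddle.mp3 are printed, since the comparison
lowercases the tag before matching.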

----

To truly handle all filenames, additional options are needed to produce NUL-terminated filenames.
This can be done using the following incantation:

  find [DIRECTORY] -type f -print0 \
       | xargs -0r \
            file --raw --no-buffer --no-pad \
                 --mime-type --print0 --print0 \
       | sed -zn 'h;n;/application\/x-archive/{x;p}' \
       | xargs -0 -n1 echo == processing file:

For an explanation of it, see:
https://lists.gnu.org/archive/html/coreutils/2019-05/msg00010.html
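As a rough illustration of the sed stage, it can be run on simulated
input. With --print0 given twice, file(1) prints each filename followed
by NUL, then the description followed by another NUL, so the records
alternate NAME, TYPE, NAME, TYPE. The sketch below fakes that stream
with printf; select_archives and the sample names are made up, and
GNU sed is assumed for the -z option.

```shell
# select_archives (hypothetical helper): reads NUL-separated NAME/TYPE pairs
# and prints the NUL-terminated NAME of every pair whose TYPE matches.
#   h  copy the current record (NAME) into the hold space
#   n  read the next record (TYPE) into the pattern space
#   /application\/x-archive/{x;p}
#      if TYPE matches, swap NAME back in and print it
select_archives() {
    sed -zn 'h;n;/application\/x-archive/{x;p}'
}

# Simulated output of file(1); tr makes the result readable on a terminal.
printf '%s\0' \
    './song.mp3' 'audio/mpeg' \
    './libfoo.a' 'application/x-archive' \
  | select_archives \
  | tr '\0' '\n'
```

Only './libfoo.a' is printed. In a real pipeline the tr step would be
dropped so that the names stay NUL-terminated for 'xargs -0'.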

For the additional option of running multiple "file(1)" processes in
parallel, see:
https://lists.gnu.org/archive/html/coreutils/2019-05/msg00008.html

----

As James previously wrote, the one-tool-one-job approach works very well here.


regards,
 - assaf



