From: Thomas Wolff
Subject: bug#19242: latest grep considers text files as binary
Date: Fri, 05 Dec 2014 10:58:49 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
Paul Eggert wrote:
> > the mentioned patches are apparently intended to fix issues in non-UTF-8 locales.
> No, they're also needed for UTF-8 locales I'm afraid. There are some security issues, not only having to do with grep's internals, but also for the behavior of downstream programs that may be expecting UTF-8 text.
> You can work around the problem with 'grep -a'.

I was aware of this workaround but I claim it should not be needed because the files affected are in fact not binary files but text files. The manual clearly says about -a: "Process a binary file as if it were text" but partial content in a different text encoding does not make a file binary.
Jim Meyering wrote:
> this is due to documented and desirable behavior.

I deny this is desirable behavior and I doubt there is a security issue as described. If any other, independent software has a security issue with non-UTF-8 input, it should decide itself to filter it and use accordingly stable decoding functions. It cannot be the task of any tool (grep in this case) to filter output to work around possible security issues in other programs in a pipe. This would be completely against the concept of pipes in the Unix tradition.
Honestly I think this is another case of practical usefulness losing against dogma in software design.
Kind regards, Thomas