bug-grep


From: Paul Eggert
Subject: bug#28255: grep erroneously skips Microsoft UTF-8 text files as being binary
Date: Sun, 27 Aug 2017 14:47:28 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

Simon wrote:
> Windows text files can start with a byte order mark of U+FEFF and then
> be encoded in UTF-8.  These are skipped as being binary files.

I can't reproduce this problem on Fedora 26 x86-64. Here's how I tried:

$ printf '\357\273\277x\n' >t
$ LC_ALL=C grep x t | od -c
0000000 357 273 277   x  \n
0000005

To help us diagnose the problem, please send a simple, self-contained example, and mention your platform.
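
For illustration, a self-contained test case could look roughly like the sketch below. It is only a sketch under stated assumptions: the file name bom.txt, the temporary directory, and the lookup of a UTF-8 locale via `locale -a` are hypothetical choices, not details taken from the original report. It repeats the check above in the C locale and, if the system lists one, in a UTF-8 locale, since the result may depend on the locale.

```shell
# Hypothetical reproduction script; the file name bom.txt and the
# locale lookup are illustrative assumptions, not from the report.
tmp=$(mktemp -d) || exit 1
printf '\357\273\277x\n' > "$tmp/bom.txt"    # UTF-8 BOM (EF BB BF), then "x"

# Try the C locale plus the first UTF-8 locale the system lists, if any.
utf8=$(locale -a 2>/dev/null | grep -i -m1 'utf-\{0,1\}8')
for loc in C $utf8; do
  printf '%s: ' "$loc"
  # If grep treats the file as binary, a "Binary file ... matches"
  # message shows up here instead of the matching line.
  LC_ALL=$loc grep x "$tmp/bom.txt" | od -c | head -n 1
done

grep --version | head -n 1    # the grep version is worth including in a report
```

Sending the full output of such a script, together with the platform name, would make it possible to compare results directly.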




