From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#25749: closed (grep 3.0 skips "binary" lines in ssconvert output)
Date: Thu, 16 Feb 2017 07:12:02 +0000

Your message dated Wed, 15 Feb 2017 23:11:04 -0800
with message-id <address@hidden>
and subject line Re: bug#25749: grep 3.0 skips "binary" lines in ssconvert output
has caused the debbugs.gnu.org bug report #25749,
regarding grep 3.0 skips "binary" lines in ssconvert output
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)

--
25749: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25749
GNU Bug Tracking System
Contact address@hidden with problems

--- Begin Message ---
Subject: grep 3.0 skips "binary" lines in ssconvert output
Date: Wed, 15 Feb 2017 22:36:36 -0600

Dear Madam or Sir,

That problem almost ruined my work today. I made the following note to myself, but you might also be interested:

===
current grep (2.25) is much faster than 2.5.4 from Lucid but SKIPS "binary" lines in ssconvert output; freshly compiled grep 3.0 skips fewer but still does it.

Workaround: look for the "binary match" phrase at the end of the output and apply grep -a.

Report to https://www.gnu.org/software/grep/manual/html_node/Reporting-Bugs.html ?
===

The file in question (gzipped) is attached.

My system:

===
$ uname -a
Linux ... 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
===

Commands which reproduce the problem:

===
grep . usa-format.txt > 1
grep -a . usa-format.txt > 2
diff 1 2
===

Again, the problem exists with both the Ubuntu Xenial default grep 2.25 and the new grep 3.0.

With best wishes,

Alexey Shipunov

Attachment: usa-format.txt.gz
Description: GNU Zip compressed data
--- End Message ---
--- Begin Message ---
Subject: Re: bug#25749: grep 3.0 skips "binary" lines in ssconvert output
Date: Wed, 15 Feb 2017 23:11:04 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

When I tried to read that attachment, gedit complained "There was a problem opening" it, and then "The file you opened has some invalid characters. If you continue editing this file you could corrupt this document. You can also choose another character encoding and try again." So it is not only "grep" that is having problems with the file.

Looking into it further, the file contains a non-text byte in line 13676, in the string "address@hidden W OF RALEIGH", where the "@" denotes a byte with octal value 233. This is invalid UTF-8 text.

You can work around the issue by replacing the non-text byte with a valid character, or by using "grep -a" as you noted, or by setting the LC_ALL environment variable to "C", or by using a grep pattern that does not match the non-text line.
--- End Message ---
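
The workarounds named in the reply can be sketched in shell roughly as follows. This is only an illustration under stated assumptions: the sample file is a hypothetical three-line stand-in for usa-format.txt (its contents are invented, apart from the byte with octal value 233), the function names are made up for this sketch, and using tr to substitute "?" is just one possible way of "replacing the non-text byte with a valid character". GNU grep and tr are assumed.

```shell
#!/bin/sh
# Hypothetical stand-in for usa-format.txt: three lines, the second one
# containing a single byte with octal value 233 (not valid UTF-8).
make_sample() {
    printf 'first line\nabc\233def\nlast line\n' > "$1"
}

# Print the line numbers that contain the byte. LC_ALL=C makes grep treat
# every byte as a character, -a forces text mode, -F matches the byte literally.
find_bad_lines() {
    LC_ALL=C grep -a -n -F "$(printf '\233')" "$1" | cut -d: -f1
}

# Workaround from the report: force text mode with -a.
count_with_a() {
    grep -a -c . "$1"
}

# Workaround from the reply: run grep in the C locale.
count_in_c_locale() {
    LC_ALL=C grep -c . "$1"
}

# Workaround from the reply: replace the non-text byte first.
# The replacement character "?" is an arbitrary choice for this sketch.
count_after_replacing() {
    LC_ALL=C tr '\233' '?' < "$1" | grep -c .
}

sample=$(mktemp)
make_sample "$sample"
find_bad_lines "$sample"
count_with_a "$sample"
count_in_c_locale "$sample"
count_after_replacing "$sample"
rm -f "$sample"
```

On the sample file, the first command should report line 2 and each of the three counts should be 3, that is, no line is skipped. Against the real attachment the same functions could be pointed at usa-format.txt instead of the temporary file.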