Re: small ascii files can be sparse
From: Paul Eggert
Subject: Re: small ascii files can be sparse
Date: Fri, 27 Jul 2012 12:29:27 -0700
User-agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0
On 07/27/2012 07:36 AM, Martin Carroll wrote:
> a "used" value of 0 for small ascii files is technically within spec
That's not clear. The NFSv3 spec surely does not
grant permission to the server to (say) report a
used count of zero at all times, claiming that this is
technically within spec.
But you're right that 'grep' should interoperate with
these servers, so I pushed the following patch into the
grep master. It'd be nice to generalize this to other apps
but that's a bigger project.
Thanks for the bug report.
From 2f0255e9f4cc5cc8bd619d1f217902eb29b30bc2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Fri, 27 Jul 2012 12:14:14 -0700
Subject: [PATCH] grep: don't falsely report tiny text files as binary
* NEWS: Document this.
* src/main.c (file_is_binary): When we are already at apparent
EOF, skip the file-size check, as some servers use zero blocks
to store binary files. Reported by Martin Carroll in
<http://lists.gnu.org/archive/html/bug-grep/2012-07/msg00016.html>.
---
NEWS | 5 +++++
src/main.c | 17 ++++++++++++-----
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/NEWS b/NEWS
index c7922ff..753aedc 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU grep NEWS -*- outline -*-
* Noteworthy changes in release ?.? (????-??-??) [?]
+** Bug fixes
+
+ 'grep' no longer falsely reports tiny text files as being binary
+ on file systems that store tiny files' contents in metadata.
+
* Noteworthy changes in release 2.13 (2012-07-04) [stable]
diff --git a/src/main.c b/src/main.c
index dda7c9b..96e4f37 100644
--- a/src/main.c
+++ b/src/main.c
@@ -476,11 +476,18 @@ file_is_binary (char const *buf, size_t bufsize, int fd, struct stat const *st)
          represent its data, then it must have at least one hole. */
       if (HAVE_STRUCT_STAT_ST_BLOCKS)
         {
-          off_t nonzeros_needed = st->st_size - cur + bufsize;
-          off_t full_blocks = nonzeros_needed / ST_NBLOCKSIZE;
-          int partial_block = 0 < nonzeros_needed % ST_NBLOCKSIZE;
-          if (ST_NBLOCKS (*st) < full_blocks + partial_block)
-            return 1;
+          /* Some servers store tiny files using zero blocks, so skip
+             this check at apparent EOF, to avoid falsely reporting
+             that a tiny zero-block file is binary. */
+          off_t not_yet_read = st->st_size - cur;
+          if (0 < not_yet_read)
+            {
+              off_t nonzeros_needed = not_yet_read + bufsize;
+              off_t full_blocks = nonzeros_needed / ST_NBLOCKSIZE;
+              int partial_block = 0 < nonzeros_needed % ST_NBLOCKSIZE;
+              if (ST_NBLOCKS (*st) < full_blocks + partial_block)
+                return 1;
+            }
         }
       /* Look for a hole after the current location. */
--
1.7.6.5
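
For illustration only, the block-count heuristic that the patch changes can be sketched as a standalone function. This is a minimal sketch and not grep's code: the name fewer_blocks_than_data and the skip_at_eof flag are hypothetical, plain int64_t parameters stand in for the struct stat fields and the ST_NBLOCKS/ST_NBLOCKSIZE macros, and cur is assumed to be the file offset just past the bytes already buffered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of grep.  Return true if a file that
   reports NBLOCKS allocated blocks of BLOCKSIZE bytes has too few
   blocks to hold its data, which suggests it has at least one hole.
   SIZE is the reported file size, CUR the offset just past the
   buffered bytes (an assumption of this sketch), and BUFSIZE the
   number of bytes already in the buffer.  */
static bool
fewer_blocks_than_data (int64_t size, int64_t cur, int64_t bufsize,
                        int64_t nblocks, int64_t blocksize,
                        bool skip_at_eof)
{
  assert (0 < blocksize);
  int64_t not_yet_read = size - cur;

  /* Patched behavior: at apparent EOF the whole file is already in
     the buffer, so do not infer a hole from the block count.  */
  if (skip_at_eof && not_yet_read <= 0)
    return false;

  /* Unpatched behavior, and patched behavior before EOF: compare the
     reported blocks with the blocks the data would need.  */
  int64_t nonzeros_needed = not_yet_read + bufsize;
  int64_t full_blocks = nonzeros_needed / blocksize;
  int partial_block = 0 < nonzeros_needed % blocksize;
  return nblocks < full_blocks + partial_block;
}
```

With a 12-byte file that has been read completely and whose server reports zero blocks, the check fires when skip_at_eof is false (the old behavior) and is skipped when it is true (the new behavior). A file with data still unread is checked the same way in both cases.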