bug-grep


From: Jim Meyering
Subject: Re: SEEK_HOLE defined but useless on linux-3.4+/ext4 [Re: small ascii files can be sparse
Date: Wed, 01 Aug 2012 13:27:40 +0200

Jim Meyering wrote:

> Jim Meyering wrote:
>> Paul Eggert wrote:
>>> On further thought, the heuristic is also incorrect for file
>>> systems that compress their data.  So I installed this further
>>> patch.
>>>
>>> Oh, well.  At least the code is simpler now.  Simple and slow
>>> is better than complicated and fast and occasionally wrong.
>> ...
>>> Subject: [PATCH] grep: don't falsely report compressed text files as binary
>>>
>>> * NEWS: Document this.
>>> * src/main.c (file_is_binary): Remove the heuristic based on
>>> st_blocks, as it does not work for compressed file systems.
>>> On Solaris, it'd be cheap to test whether the file system is known
>>> to be uncompressed, which would allow the heuristic, but Solaris has
>>> SEEK_HOLE so there's little point.
>>
>> Hi Paul,
>>
>> Without the st_blocks-based heuristic, grep's big-hole test now fails
>> (exhausts memory and exits with status 2) on an ext4 file system with
>> a recent linux kernel.
>> That happens because while SEEK_HOLE and SEEK_DATA are now defined,
>> the kernel's ext4 lseek/SEEK_HOLE support is just a stub that simply
>> returns the length of the file.
>>
>> For the record, the SEEK_HOLE support for btrfs and xfs in
>> linux-3.4.4 (F17) works the way I would expect, and it looks
>> like ocfs2 is fine, too.
>>
>> Here's a demo:
>>
>> SEEK_HOLE works (detects the hole) with btrfs (SEEK_HOLE == 4):
>>
>>     $ perl -e '$f=*STDERR; sysseek($f,2**22,0); syswrite($f,"a");' \
>>       -e 'print 0+sysseek($f,0,4)' 2> j; stat -f --fo=\ %T .
>>     0 btrfs
>>
>> SEEK_HOLE is not usable (reports "hole" at EOF) with ext4:
>> stat -f reports ext2/ext3, but that's only looking at the magic number.
>> It's really ext4:
>>
>>     $ perl -e '$f=*STDERR; sysseek($f,2**22,0); syswrite($f,"a");' \
>>       -e 'print 0+sysseek($f,0,4)' 2> j; stat -f --fo=\ %T .
>>     4194305 ext2/ext3
>>
>> tmpfs uses the same code, and gives the same result:
>>
>>     4194305 tmpfs
>
> A quick update:
> At least with recent linux kernels (3.5.0+), tmpfs now does
> have SEEK_HOLE support.  Confirmed on fedora rawhide.
> Thanks to Jeff Layton for the tip.

Not quite.
tmpfs SEEK_HOLE support was added, but then removed at the last minute.
Hence it is present in rawhide's 3.5.0-0.rc6.git0.2.fc18.x86_64 kernel,
but was removed before the 3.5.0 release:

  http://thread.gmane.org/gmane.linux.kernel.mm/82183/focus=82185
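
For anyone who wants to repeat the probe without Perl, here is a rough
Python sketch of the same test. It is not grep's code and not part of
any patch in this thread. The helper names (make_demo_file,
first_hole_offset, allocated_bytes, classify) are made up for this
example. It assumes a Python whose os module exposes os.SEEK_HOLE, and
it assumes st_blocks counts 512-byte units, as the Python documentation
describes. The allocated-bytes figure only illustrates the general idea
behind an st_blocks comparison; as noted above, that comparison is
unreliable on file systems that compress their data.

```python
import os
import sys
import tempfile

HOLE_SIZE = 2**22  # same offset the Perl demo seeks to before writing "a"


def make_demo_file(path, hole_size=HOLE_SIZE):
    """Create a file with a leading hole: seek past hole_size, write one byte."""
    with open(path, "wb") as f:
        f.seek(hole_size)
        f.write(b"a")


def first_hole_offset(path):
    """Return what lseek(fd, 0, SEEK_HOLE) reports, or None if it is unavailable."""
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is None:
        return None
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, seek_hole)
    except OSError:
        # The platform or file system rejected the request.
        return None
    finally:
        os.close(fd)


def allocated_bytes(path):
    """Return st_blocks * 512 (assumed unit), or None where st_blocks is missing."""
    blocks = getattr(os.stat(path), "st_blocks", None)
    return None if blocks is None else blocks * 512


def classify(hole_offset, size):
    """Interpret a SEEK_HOLE result for a file of the given size."""
    if hole_offset is None:
        return "unsupported"
    if hole_offset < size:
        return "hole detected"
    # An offset equal to the file size is only the implicit hole at EOF.
    return "no hole reported"


if __name__ == "__main__":
    # Probe the directory given on the command line, else the temp directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    with tempfile.TemporaryDirectory(dir=directory) as d:
        path = os.path.join(d, "j")
        make_demo_file(path)
        size = os.path.getsize(path)
        offset = first_hole_offset(path)
        print("size:", size)
        print("SEEK_HOLE offset:", offset, "->", classify(offset, size))
        print("allocated bytes:", allocated_bytes(path))
```

Run it with a directory on the file system of interest as the only
argument. In the demos quoted above, btrfs reported offset 0 (the hole
is seen), while ext4 and tmpfs reported 4194305, which is just the file
size.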


