bug-grep

bug#27666: [grep on GPFS filesystem] SEEK_HOLE problem


From: Eric Blake
Subject: bug#27666: [grep on GPFS filesystem] SEEK_HOLE problem
Date: Thu, 20 Jul 2017 07:46:26 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 07/20/2017 04:03 AM, Moyard John wrote:
> Thank you very much for your detailed answer.
> "close(2)" is involved because the test case made to reproduce the problem
> uses a cp (originally Fortran code that makes a copy), followed by a grep.
> 
> I clearly understand your point of view about 
>      reporting holes and NUL bytes
>      GPFS incompatibility with other programs/commands that could use 
> SEEK_HOLE
> I took a quick look at this last point and have not yet found any 
> system command using it.
> Do you have an example of another command using SEEK_HOLE?

More and more commands are starting to make optimizations based on
SEEK_HOLE.  cp, tar, diff, grep, etc.  Programs like qemu-img REQUIRE a
working SEEK_HOLE for efficiently managing sparse virtual machine disk
images.
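
For illustration, here is a rough sketch of the kind of loop such tools
run to skip holes. It is written in Python rather than taken from any of
the programs named above; the function name data_extents is hypothetical,
and it assumes a platform where os.SEEK_DATA and os.SEEK_HOLE are defined
(for example Linux).

```python
import errno
import os


def data_extents(fd):
    """Yield (start, end) byte ranges that the file system reports as data."""
    size = os.fstat(fd).st_size
    offset = 0
    while offset < size:
        try:
            # Jump to the next byte that is not inside a hole.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # nothing but a hole between offset and end-of-file
            raise
        # Find where this run of data ends (a hole, or end-of-file).
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:
            end = size  # defensive guard so the loop always makes progress
        yield (start, end)
        offset = end
```

A caller that only reads the yielded ranges never touches the holes,
which is where the speedup comes from, and also why a hole reported in
the wrong place makes the caller silently miss real data.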

> 
> From the POSIX point of view, the lseek(2) man page says:
> SEEK_DATA and SEEK_HOLE are nonstandard extensions also present in Solaris, 
> FreeBSD, and DragonFly BSD
> They are proposed for inclusion in the next POSIX revision (Issue 8)
> Do you have any information about it?

Here's the proposed POSIX wording:
http://austingroupbugs.net/view.php?id=415

Requiring close() to occur before SEEK_HOLE is accurate is a bug in GPFS
(if any other process can read() non-zero data but lseek(SEEK_HOLE)
still claims that section of the file is a hole, then the file system is
buggy, per the wording POSIX will be adding).
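
The consistency rule above can be checked mechanically. The following is
a minimal sketch in Python, not a GPFS reproducer: the function name
misreported_holes is hypothetical, it assumes os.SEEK_HOLE, os.SEEK_DATA
and os.pread are available, and it runs in a single process, so it would
only catch the reported problem if pointed at a file that another
process has written and not yet closed.

```python
import errno
import os


def misreported_holes(fd, chunk=1 << 16):
    """Return the (start, end) hole ranges in which reads find non-zero bytes."""
    size = os.fstat(fd).st_size
    bad = []
    offset = 0
    while offset < size:
        hole = os.lseek(fd, offset, os.SEEK_HOLE)
        if hole >= size:
            break  # only the implicit hole at end-of-file is left
        try:
            data = os.lseek(fd, hole, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno != errno.ENXIO:
                raise
            data = size  # the hole runs to end-of-file
        pos = hole
        while pos < data:
            block = os.pread(fd, min(chunk, data - pos), pos)
            if not block:
                break  # file shrank underneath us
            if any(block):
                # Non-zero data inside a range reported as a hole.
                bad.append((hole, data))
                break
            pos += len(block)
        offset = max(data, hole + 1)  # always move forward
    return bad
```

On a file system that reports holes correctly, the returned list is
empty no matter how sparse the file is.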


> Could grep be built so that it avoids the SEEK_HOLE test?

No. Avoiding a buggy SEEK_HOLE in grep won't fix all the other programs
(like cp, tar, diff) that are also negatively impacted by the buggy
SEEK_HOLE.  Fix the GPFS bug, and then all of the user-space apps will
no longer be impacted by the bug.

[By the way, top-posting is frowned on for technical lists].  I agree
with Paul's conclusion:

> Really, GPFS needs to be fixed. If GPFS can't support SEEK_HOLE correctly, it 
> should simply have lseek with SEEK_HOLE go to end-of-file; that will work 
> with 'grep' (albeit more slowly), and is the documented way that SEEK_HOLE is 
> supposed to work on file systems that cannot support SEEK_HOLE directly.
> 
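
To make that fallback concrete: a file system that cannot track holes is
expected to answer SEEK_HOLE with the end-of-file offset, so the whole
file looks like one data extent. The sketch below models that contract
from the caller's side in Python. It is an illustration only, not what
grep or GPFS does; the name seek_hole_or_eof is hypothetical, and
treating EINVAL and EOPNOTSUPP as "not supported" is an assumption.

```python
import errno
import os


def seek_hole_or_eof(fd, offset):
    """Return the offset of the first hole at or after offset.

    If SEEK_HOLE is unavailable, behave like a file system without hole
    support: treat the file as all data and report end-of-file.
    """
    size = os.fstat(fd).st_size
    if offset >= size:
        # Mirrors lseek(2): seeking for a hole beyond end-of-file fails.
        raise OSError(errno.ENXIO, os.strerror(errno.ENXIO))
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is not None:
        try:
            return os.lseek(fd, offset, seek_hole)
        except OSError as exc:
            if exc.errno not in (errno.EINVAL, errno.EOPNOTSUPP):
                raise
    # Fallback: the only hole is the implicit one at end-of-file.
    return os.lseek(fd, size, os.SEEK_SET)
```

With this answer a caller reads every byte, which is slower on sparse
files but never skips data.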

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


