coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] copy: adjust fiemap handling of sparse files


From: Jim Meyering
Subject: Re: [PATCH] copy: adjust fiemap handling of sparse files
Date: Thu, 10 Feb 2011 11:53:38 +0100

Pádraig Brady wrote:
> On 09/02/11 23:35, Pádraig Brady wrote:
>>>> Subject: [PATCH] copy: adjust fiemap handling of sparse files
>>>>
>>>> Don't depend on heuristics to detect sparse files
>>>> if fiemap is available.  Also don't introduce new
>>>> holes unless --sparse=always has been specified.
>
>>>> * src/copy.c (extent_copy): Pass the user specified
>>>> sparse mode, and handle as described above.
>>>> Also a redundant lseek has been suppressed when
>>>> there is no hole between two extents.
>>>
>>> Could that be done in two separate patches?
>
> Here is the first patch split out to suppress redundant lseek()s.
> It's expanded now to suppress lseek in the source as well as dest.
>
> $ fallocate -l $((1<<29)) /tmp/t.f
> $ strace -e _llseek cp-before /tmp/t.f /dev/null 2>&1 | wc -l
> 180
> $ strace -e _llseek cp /tmp/t.f /dev/null 2>&1 | wc -l
> 0
>
> Hmm, the above suggests to me that we could use the
> FIEMAP_EXTENT_UNWRITTEN flag, so as to skip the read()
> for these allocated but unwritten (zero) extents.
> We could treat such extents like holes, except we
> would only generate holes in the dest with SPARSE_ALWAYS.
> I'll do patch for that this evening.
>
> Anyway the following small reorg will also simplify
> possible future changes to introduce fallocate().
>
> Note the attached mostly changes indenting,
> with the significant change being just:
>
> $ git diff -w --no-ext-diff HEAD~
>
> diff --git a/src/copy.c b/src/copy.c
> index 9182c16..5b6ffe3 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -334,7 +334,10 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>          {
>            off_t ext_start = scan.ext_info[i].ext_logical;
>            uint64_t ext_len = scan.ext_info[i].ext_length;
> +          uint64_t hole_size = ext_start - last_ext_start - last_ext_len;
>
> +          if (hole_size)
> +            {
>            if (lseek (src_fd, ext_start, SEEK_SET) < 0)
>              {
>                error (0, errno, _("cannot lseek %s"), quote (src_name));
> @@ -356,11 +359,6 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t 
> buf_size,
>                /* When not inducing holes and when there is a hole between
>                   the end of the previous extent and the beginning of the
>                   current one, write zeros to the destination file.  */
> -              if (last_ext_start + last_ext_len < ext_start)
> -                {
> -                  uint64_t hole_size = (ext_start
> -                                        - last_ext_start
> -                                        - last_ext_len);
>                    if (! write_zeros (dest_fd, hole_size))
>                      {
>                        error (0, errno, _("%s: write failed"), quote 
> (dst_name));

Thanks!  that does the job and passes the tests.

While using --sparse=always is a lot less common,
it looks like there's room for improvement there, too.
I.e., we should be able to eliminate all of these
lseek calls on the output descriptor there, too:

    $ strace -e lseek cp --sparse=always /tmp/t.f /tmp/t2 2>&1|wc -l
    16384



reply via email to

[Prev in Thread] Current Thread [Next in Thread]