bug-coreutils

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: jeff.liu
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Wed, 09 Jun 2010 22:46:20 +0800
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Jim Meyering wrote:
> Jim Meyering wrote:
>> Subject: [PATCH 01/10] cp: Add FIEMAP support for efficient sparse file copy
> 
> FYI, using those patches, I ran a test for the first time in a few days:
> 
>     check -C tests TESTS=cp/sparse-fiemap VERBOSE=yes
> 
> It failed like this on an ext4 partition using F13:
> 
>     + timeout 10 cp --sparse=always sparse fiemap
>     + fail=1
>     ++ stat --printf %s sparse
>     ++ stat --printf %s fiemap
>     + test 1099511628800 = 0
>     + fail=1
> 
> That is very odd.  No diagnostic from cp, yet it failed
> after creating a zero-length file.
> 
> Here's the corresponding piece of the script:
> 
>     # It takes many minutes to copy this sparse file using the old method.
>     # By contrast, it takes far less than 1 second using FIEMAP-copy.
>     timeout 10 cp --sparse=always sparse fiemap || fail=1
> 
>     # Ensure that the sparse file copied through fiemap has the same size
>     # in bytes as the original.
>     test $(stat --printf %s sparse) = $(stat --printf %s fiemap) || fail=1
> 
> However, so far I've been unable to reproduce the failure,
> running hundreds of iterations:
> 
>     for i in $(seq 300); do printf .; make check -C tests \
>       TESTS=cp/sparse-fiemap VERBOSE=yes >& makerr-$i || break; done
> 
> Have any of you heard of a problem whereby a cold cache can cause
> such a thing?  "echo 3 > /proc/sys/vm/drop_caches" didn't help.
Hi Jim,

Did you run `sync' before dropping the buffers and caches?  Actually, even if
`sync' is run first, the dirty objects sometimes still cannot be freed. :(

I can reproduce this issue on ext4 and btrfs (physically mounted partitions),
either by hand or by just running the sparse-fiemap test script; ocfs2 always
works fine in this case.

I guess this issue might be caused by the 'cold cache' you mentioned above.
In my tests, after dropping the caches, cp via fiemap always works in my test
environment; otherwise, it fails from time to time.
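
One thing that may be worth comparing: the FIEMAP ioctl accepts a
FIEMAP_FLAG_SYNC flag in fm_flags, which asks the kernel to sync the file
before mapping it.  Whether that explains the missing extent here is only a
guess.  A minimal sketch for checking the extent count with and without the
flag (count_extents is a hypothetical helper, not part of the patch):

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_FLAG_SYNC, FIEMAP_MAX_OFFSET */

/* Hypothetical helper: return the number of extents the kernel reports
   for FD, or -1 with errno set if the FIEMAP ioctl fails.  */
static int
count_extents (int fd, unsigned int flags)
{
  struct fiemap fm;

  memset (&fm, 0, sizeof fm);
  fm.fm_start = 0;
  fm.fm_length = FIEMAP_MAX_OFFSET;
  fm.fm_flags = flags;
  /* With fm_extent_count == 0 the kernel fills in no extent records and
     only reports how many there are in fm_mapped_extents.  */
  fm.fm_extent_count = 0;

  if (ioctl (fd, FS_IOC_FIEMAP, &fm) < 0)
    return -1;
  return (int) fm.fm_mapped_extents;
}
```

Calling count_extents (fd, 0) and then count_extents (fd, FIEMAP_FLAG_SYNC)
on a freshly written file should show whether the sync makes a difference on
a given filesystem.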

My kernel version:
Linux jeff-laptop 2.6.33-rc5-00238-gb04da8b-dirty #11 SMP Sat Dec 19 22:02:01 CST 2009 i686 GNU/Linux

address@hidden:/ext4$ dd if=/dev/zero of=sparse bs=1k count=1 seek=1G
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000156654 s, 6.5 MB/s
address@hidden:/ext4$ ls -l sparse
-rw-r--r-- 1 jeff jeff 1099511628800 Jun  9 22:21 sparse
address@hidden:/ext4$ filefrag sparse
sparse: 0 extents found
address@hidden:/ext4$ filefrag -v sparse
Filesystem type is: ef53
File size of sparse is 1099511628800 (268435457 blocks, blocksize 4096)
 ext  logical physical expected length flags
sparse: 1 extent found

To free the buffer cache:
=========================
address@hidden:/ext4$ free
             total       used       free     shared    buffers     cached
Mem:       1980300     719972    1260328          0       2836      94104
-/+ buffers/cache:     623032    1357268
Swap:            0          0          0
address@hidden:/ext4$ sync

In another root console, run 'echo 3 > /proc/sys/vm/drop_caches'

address@hidden:/ext4$ free
             total       used       free     shared    buffers     cached
Mem:       1980300     716780    1263520          0       1184      88592   <<<<<-----freed
-/+ buffers/cache:     627004    1353296
Swap:            0          0          0

address@hidden:/ext4$ filefrag -v sparse
Filesystem type is: ef53
File size of sparse is 1099511628800 (268435457 blocks, blocksize 4096)
 ext  logical physical expected length flags
   0 268435456    32999               1 eof
sparse: 2 extents found

address@hidden:/ext4$ ./cp --sparse=always sparse f1
last_ext_logical 1099511627776 last_read_size 1024 src_total_size 1099511628800
address@hidden:/ext4$ filefrag -v f1
Filesystem type is: ef53
File size of f1 is 1099511628800 (268435457 blocks, blocksize 4096)
 ext  logical physical expected length flags
   0 268435456   296960               1 eof
f1: 2 extents found
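
As a sanity check, the numbers in these transcripts are consistent with each
other: the file size follows from the dd arguments, the block count is that
size rounded up to 4096-byte blocks, and the logical block of the only extent
corresponds to last_ext_logical.  A small sketch of the arithmetic (the helper
names are made up for illustration and are not part of cp):

```c
#include <assert.h>
#include <stdint.h>

/* Size of a new file created by `dd bs=BS count=COUNT seek=SEEK':
   SEEK blocks are skipped, then COUNT blocks are written.  */
static uint64_t
dd_file_size (uint64_t bs, uint64_t count, uint64_t seek)
{
  return bs * (seek + count);
}

/* Number of BLOCKSIZE-byte blocks needed to cover SIZE bytes, rounded up.  */
static uint64_t
blocks_covering (uint64_t size, uint64_t blocksize)
{
  assert (blocksize != 0);
  return (size + blocksize - 1) / blocksize;
}

/* Byte offset at which logical block LBLK starts.  */
static uint64_t
logical_byte_offset (uint64_t lblk, uint64_t blocksize)
{
  return lblk * blocksize;
}
```

With bs=1k, count=1 and seek=1G, dd_file_size gives 1099511628800;
blocks_covering (1099511628800, 4096) gives 268435457; and
logical_byte_offset (268435456, 4096) gives 1099511627776, which plus the
last_read_size of 1024 equals src_total_size.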


address@hidden:/ext4$ ./cp --sparse=always sparse f2
last_ext_logical 1099511627776 last_read_size 1024 src_total_size 1099511628800

address@hidden:/ext4$ filefrag -v f2
Filesystem type is: ef53
File size of f2 is 1099511628800 (268435457 blocks, blocksize 4096)
 ext  logical physical expected length flags
f2: 1 extent found

address@hidden:/ext4$ sync

In another root console, run 'echo 3 > /proc/sys/vm/drop_caches' again.

address@hidden:/ext4$ filefrag -v f2
Filesystem type is: ef53
File size of f2 is 1099511628800 (268435457 blocks, blocksize 4096)
 ext  logical physical expected length flags
   0 268435456    33379               1 eof
f2: 2 extents found


I will double-check my original patch to make sure this issue is not caused
by a bug in the code, once I get through an urgent task at hand.

Thanks,
-Jeff

> I suspect that having so many extents is unusual, so maybe
> this is a rarely exercised corner case.
> 
> ===============================
> As I wrote the above, I realized I probably had enough
> information to deduce where things were going wrong, even
> if so far I've been unable to reproduce it.
> 
> And sure enough.  There is a way to provoke exactly
> that failure.  If the *second* (or later) FIEMAP ioctl fails:
> 
>   do
>     {
>       fiemap->fm_length = FIEMAP_MAX_OFFSET;
>       fiemap->fm_extent_count = count;
> 
>       /* When ioctl(2) fails, fall back to the normal copy only if it
>          is the first time we met.  */
>       if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0)
>         {
>           /* If the first ioctl fails, tell the caller that it is
>              ok to proceed with a normal copy.  */
>           if (i == 0)
>             *normal_copy_required = true;
>           return false;
>         }
> 
> In that case, fiemap_copy returns false (with no diagnostic)
> and cp fails silently.
> 
> Obviously I will now add code to diagnose the failure,
> but do any of you know off hand how to reproduce this
> or what the failure might have been?
> 
> Here's the patch I plan to merge:
> 
> diff --git a/src/copy.c b/src/copy.c
> index eb67700..07d605e 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -200,6 +200,12 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
>               ok to proceed with a normal copy.  */
>            if (i == 0)
>              *normal_copy_required = true;
> +          else
> +            {
> +              /* If the second or subsequent ioctl fails, diagnose it,
> +                 since it ends up causing the entire copy/cp to fail.  */
> +              error (0, errno, _("%s: FIEMAP ioctl failed"), quote (src_name));
> +            }
>            return false;
>          }
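
One possible way to exercise the "second or later ioctl fails" path without
depending on a real filesystem error is to inject the failure.  The sketch
below is not the code in copy.c: extent_scan, map_fn and the three callbacks
are hypothetical names, the callback stands in for
ioctl (src_fd, FS_IOC_FIEMAP, fiemap), and fprintf stands in for error ().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the FIEMAP ioctl: returns 0 on success, -1 with errno set
   on failure, and sets *LAST when there are no more extents to map.  */
typedef int (*map_fn) (unsigned int call_index, bool *last);

/* Same control flow as the quoted loop: a failure on the first call
   requests a normal copy; a later failure is diagnosed.  */
static bool
extent_scan (map_fn map, const char *src_name,
             bool *normal_copy_required, bool *diagnosed)
{
  bool last = false;

  *normal_copy_required = false;
  *diagnosed = false;

  for (unsigned int i = 0; !last; i++)
    {
      if (map (i, &last) < 0)
        {
          if (i == 0)
            *normal_copy_required = true;
          else
            {
              fprintf (stderr, "%s: FIEMAP ioctl failed: %s\n",
                       src_name, strerror (errno));
              *diagnosed = true;
            }
          return false;
        }
      /* The real code would copy the extents returned by this call here.  */
    }
  return true;
}

/* Injected fault: succeed on the first call, then fail with EIO.  */
static int
fail_on_second (unsigned int call_index, bool *last)
{
  *last = false;
  if (call_index >= 1)
    {
      errno = EIO;
      return -1;
    }
  return 0;
}

/* Fail immediately, as a filesystem without FIEMAP support would.  */
static int
fail_on_first (unsigned int call_index, bool *last)
{
  (void) call_index;
  *last = false;
  errno = EOPNOTSUPP;
  return -1;
}

/* Report every extent in a single successful call.  */
static int
succeed_once (unsigned int call_index, bool *last)
{
  (void) call_index;
  *last = true;
  return 0;
}
```

Passing fail_on_second to extent_scan makes it return false with a
diagnostic and without requesting a normal copy, which is the case the patch
above is meant to make visible.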


-- 
With Windows 7, Microsoft is asserting legal control over your computer and
is using this power to abuse computer users.




