bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS


From: George Valkov
Subject: bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS
Date: Fri, 10 Feb 2023 22:06:22 +0200

> On 2023-02-10, at 8:58 PM, Pádraig Brady <P@draigBrady.com> wrote:
> 
> On 10/02/2023 17:24, George Valkov wrote:
>>> On 2023-02-10, at 4:02 PM, Pádraig Brady <P@draigBrady.com> wrote:
>>> 
>>> On 10/02/2023 12:13, George Valkov wrote:
>>>>> On 2023-02-10, at 11:18 AM, Pádraig Brady <P@draigBrady.com> wrote:
>>>>> 
>>>>> I'll apply the simple patch later I think.
>>>> I have an interesting idea: If I copy a large file, say the 16 GB disk 
>>>> image
>>>> where I compiled OpenWRT. So I copy this on the same filesystem, and check
>>>> disk usage before an after: no change. Also the copy is instant. That’s 
>>>> because
>>>> APFS supports copy-on-write. Initially both files share the same sectors 
>>>> on disk,
>>>> any changes made after that are copied to new sectors.
>>>> This is the most efficient way to copy files on APFS, and should produce 
>>>> no corruption.
>>>> Let’s implement it. I would assume there is a system cal that does the 
>>>> entire copy,
>>>> And we don’t need to read and write any data.
>>>> Does the trace contain any interesting calls that might be related to that?
>>> 
>>> When you say "I copy a large file", is that with gcp or something else?
>> Finder: option + drag in the same directory
>>> Since coreutils 9.1 we try the CoW by default with fclonefileat()
>>> which is available since macOS 10.12 (2016).
>> I can confirms fclonefileat works on macOS 12.6.3 and solves the issue.
>> File attributes are also preserved. To clarify: these observations are from
>> the sample below. I don’t have internal experience with cloreutils, but I can
>> test it if you provide some description.
>>> Note that works only when src and dest are within the same file system,
>>> but that is the case for you if I'm reading your original report correctly.
>> Correct. fclonefileat only works on the same file system.
>> If we attempt to clone to another volume, fclonefileat fails 18 Cross-device 
>> link.
>> When building OpenWRT everything works on the same file system, so
>> fclonefileat is applicable.
>>> When I mentioned that earlier I thought your macOS version was too old to 
>>> support that,
>>> but in fact your 12.6.3 should support fclonefileat() fine.
>> Thankfully it is supported. The latest security update came in January 2023.
>>> So that's another variable. Is that call failing completely for you?
>>> You might be quicker to add a couple of printfs around the
>>> fclonefileat() calls in the coreutils from latest git you compliled.
>>> Note there is new error handling related to that call in the latest git,
>>> compared to what was released in coreutils 9.1.
>> Coreutils tests on master
>> git clone git://git.savannah.gnu.org/coreutils.git
>> mv coreutils coreutils-clone
>> cd coreutils-clone
>> git submodule foreach git pull origin master
>> ./bootstrap
>> ./configure
>> make -j 16
>> git log
>> commit d374d32ccf12f8cf33c8f823d573498b7c8b27a4 (HEAD -> clone, 
>> origin/master, origin/HEAD, master)
>> cd ..
>> ./coreutils-clone/src/cp cc1 cc1-test
>> printf("HAVE_FCLONEFILEAT %u  USE_XATTR %u\n", HAVE_FCLONEFILEAT, USE_XATTR);
>> HAVE_FCLONEFILEAT 1  USE_XATTR 0
>>       int fc_flags = x->preserve_ownership ? 0 : CLONE_NOOWNERCOPY;
>>              printf(
>>                      "data_copy_required      %u  x->reflink_mode        
>> %u\n"
>>                      "x->preserve_mode        %u  x->preserve_timestamps 
>> %u\n"
>>                      "x->preserve_ownership   %u  CLONE_NOOWNERCOPY      
>> %u\n",
>>                      data_copy_required, x->reflink_mode,
>>                      x->preserve_mode, x->preserve_timestamps,
>>                      x->preserve_ownership, CLONE_NOOWNERCOPY
>>              );
>>       if (data_copy_required && x->reflink_mode
>>           && x->preserve_mode && x->preserve_timestamps
>>           && (x->preserve_ownership || CLONE_NOOWNERCOPY))
>>         {
>>                      int a = fclonefileat (source_desc, dst_dirfd, 
>> dst_relname, fc_flags);
>>                      int e = errno;
>>                      printf("fclonefileat %i %i\n", a, e);
>>           if (a == 0)
>>             goto close_src_desc;
>> data_copy_required      1  x->reflink_mode        1
>> x->preserve_mode        0  x->preserve_timestamps 0
>> x->preserve_ownership   0  CLONE_NOOWNERCOPY      2
>> fclonefileat is not used because x->preserve_mode = 0, 
>> x->preserve_timestamps = 0
>> That design seems wrong. The fact than no one requested to preserve that 
>> data,
>> doesn’t mean preserving it via fclonefileat is a bad thing. I think we should
>> always call fclonefileat and proceed to other means if it fails. It is more 
>> efficient, since
>> it requires no additional space or data to be copied. And always produces a 
>> valid copy.
>>> Note also I don't see any fclonefileat() syscalls in your Finder logs at 
>>> least.
>> Great news: fclonefileat works on the same file system, and if we
>> need to copy outside, then just disable sparse copy according to my patch.
>> One option is to start blindly with fclonefileat, and if that fails, fall 
>> back to
>> normal copy. If you can use the trace to find what Finder is doing, we can 
>> try it.
>> It might bring some improvements. Else, disable sparse and do a regular copy.
>> It was easier for me to write a small sample:
>> // clone.c
>> #include <stdio.h>
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <sys/clonefile.h>
>> int main(int argc, char ** argv)
>> {
>>      int fd = open("cc1", O_RDONLY);
>>      int dir = open("./", O_RDONLY);
>>      int a = fclonefileat(fd, dir, "cc1-clone", 0);
>>      close(fd);
>>      close(dir);
>>      printf("fd %i  dir %i  fclonefileat %i\n", fd, dir, a);
>>      return 0;
>> }
>> ./coreutils/src/cp cc1 cc1-ori-before
>> gcc clone.c && ./a.out
>> ./coreutils/src/cp cc1 cc1-ori-after
>> ll cc1*
>> -rwxr-xr-x  1 g  staff  27551296 Feb 10 17:35 cc1*
>> -rwxr-xr-x  1 g  staff  27551296 Feb 10 17:35 cc1-clone*
>> -rwxr-xr-x  1 g  staff  27551296 Feb 10 17:39 cc1-ori-after*
>> -rwxr-xr-x  1 g  staff  27551296 Feb 10 17:39 cc1-ori-before*
>> sha1sum
>> 812cb0591b000f56bdf75c0f3f94c622e371b970  cc1
>> 812cb0591b000f56bdf75c0f3f94c622e371b970  cc1-clone
>> 812cb0591b000f56bdf75c0f3f94c622e371b970  cc1-ori-after
>> 7b447132f56f3b549ef62a4780945e82d26f7d2c  cc1-ori-before
>> df -h
>> Filesystem       Size   Used  Avail Capacity iused     ifree %iused  Mounted 
>> on
>> /dev/disk1s1    1.1Ti  1.0Ti   58Gi    95% 2400403 605165920    0%   
>> /System/Volumes/Data
>> # clone a large file 15 GB using fclonefileat
>> ./a.out
>> # copy a large file 15 GB using Finder: option + drag in the same directory
>> # both complete instantly
>> df -h
>> Filesystem       Size   Used  Avail Capacity iused     ifree %iused  Mounted 
>> on
>> /dev/disk1s1    1.1Ti  1.0Ti   58Gi    95% 2400401 605207240    0%   
>> /System/Volumes/Data
>> ll -h /Users/g/vhd/OpenWRT.wrt3200*
>> -rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
>> /Users/g/vhd/OpenWRT.wrt3200-clone.sparseimage
>> -rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
>> /Users/g/vhd/OpenWRT.wrt3200-finder.sparseimage
>> -rw-r--r--@ 1 g  staff    15G Feb 10 04:06 
>> /Users/g/vhd/OpenWRT.wrt3200.sparseimage
> 
> Cool that clears up the fclonefileat() questions.
> Attached is a patch to attempt the clone unless
> we've explicitly specified --no-preserve=mode.
> I was considering "touch"ing the timestamps after also,
> but it's better to just maintain them as we're
> pointing to the same data after all.
> 
> So with this patch we should operate quickly within a file system.
> With your original patch we should operate robustly otherwise.

Thank you, Pádriag! Nice teamwork. I added x-> and Tested-by to your patch.
The copy was instant and accurate. Both apply to cc1 as well as the 15 GB file.
Please let me know when there is a new release, so I can push it to OpenWRT.

Attachment: 0001-copy-on-macOS-try-COW-even-if-not-preserving-mode.patch
Description: Binary data


./coreutils-clone/src/cp cc1 cc1-test
c9b79253a4b690c7be03ebc3187f1f09834f7472  cc1
c9b79253a4b690c7be03ebc3187f1f09834f7472  cc1-test
-rwxr-xr-x    1 g     staff  27551296 Feb 10 21:45 cc1*
-rwxr-xr-x    1 g     staff  27551296 Feb 10 21:45 cc1-test*

Build log in case you feel like fixing unrelated sprintf deprecated warnings 
later
to improve quality.
https://httpstorm.com/share/.openwrt/test/2023-02-06_coreutils-9.1/build-log-d374d32ccf12f8cf33c8f823d573498b7c8b27a4.txt

Cheers!

Georgi Valkov
httpstorm.com
nano RTOS


reply via email to

[Prev in Thread] Current Thread [Next in Thread]