bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS


From: Pádraig Brady
Subject: bug#61386: [PATCH] cp,mv,install: Disable sparse copy on macOS
Date: Tue, 7 Mar 2023 00:31:33 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 06/03/2023 23:48, Paul Eggert wrote:
On 3/6/23 03:13, Pádraig Brady wrote:
On 06/03/2023 07:37, Paul Eggert wrote:
The lseek /dev/null issue was only in your undef SEEK_DATA test patch,
and already addressed in your final gnulib patch in the area, as
discussed at:
https://lists.gnu.org/archive/html/bug-coreutils/2023-02/msg00081.html

Sorry,  I got confused about that issue versus other 'split' issues.


Also immediately rejecting input where we can't determine the size is a
feature.
I.e. the following is the expected behavior:

    $ : | split -n l/1
    split: -: cannot determine file size

But 'split' can easily determine the size there: it's zero. 'split'
doesn't need to use lseek to do that; a single 'read' will do.


With the changes we now have:

    $ : | split -n l/1  # Creates an empty file
    $ yes | split -n l/1
    split: -: cannot determine file size

This is inconsistent, and an insidious issue that users may introduce to
scripts,
that will only fail once input data hits a certain size.

That's OK, as lots of standard utilities already have that issue and
users are OK with this. Users can't reasonably expect 'split -n' to work
on infinitely-sized input such as the second example. At some point
there is a limit.

It's a bit like 'sort' if you feed 'sort' longer and longer input lines,
eventually it will fail and give you a diagnostic.

It'd be nicer if the limit of 'split' were larger than what's in current
Git. If you'd prefer that we raise the limit further, by copying stdin
into a temporary file first, I can write a patch to do that.

Also there are a few `make syntax-check` issues in the new split code.

Ouch, sorry, I'm always forgetting that. Fixed by the attached patch.
(Two of the issues were in 'split', one was elsewhere.)

syntax check now looks good thanks.

Would it be possible to revert this change in isolation?

We could revert the patch's effect (not sure if it's a simple revert,
but I could easily generate such a patch). I'm hoping, though, that we
can reach consensus on extending split's functionality instead, perhaps
along the abovementioned lines.

[dropping bug-gnulib since this is no longer relevant to Gnulib.]

I see this as a more fundamental operational thing than a limit issue.
`split -n` needing the size up front alludes to the operation,
so it's obvious that `yes | split -n l/4 ...` doesn't do round robin for example
and fails immediately.

Also the main point of only supporting a small amount of data
before failing is a bad gotcha for users.

So options I see are to persist implicitly with a temp file patch,
or leave as is and have users explicitly persist the data to split.
Advantages of leaving as is, is it's obvious to users that
all data needs to be read before splitting starts.
Also given split is often used with large data,
explicit control is often desired over where and how it's persisted.

So yes a temp file patch would be better / required,
but I'm thinking a revert would still be better again.

cheers,
Pádraig.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]