From: Ben Rusholme
Subject: bug#20511: split : does not account for --numeric-suffixes=FROM in calculation of suffix length?
Date: Tue, 5 May 2015 21:29:19 -0700
Hi,
> The info docs say about the --numeric-suffixes option:
>
> Note specifying a FROM value also disables the default auto suffix
> length expansion described above, and so you may also want to
> specify ‘-a’ to allow suffixes beyond ‘99’.
This does not seem to be the case: the auto suffix length works fine beyond 99
in the current 8.23 release.
$ seq 1000000 >& input.txt
$ split --numeric-suffixes=1234 --number=l/5678 input.txt
$ ls | tail
x6902
x6903
x6904
x6905
x6906
x6907
x6908
x6909
x6910
x6911
It only fails where FROM pushes the last suffix (FROM + CHUNKS - 1) past a power of 10:
$ rm x*
$ split --numeric-suffixes --number=l/10000 input.txt
$ ls | tail -n 3
x9997
x9998
x9999
$
$ rm x*
$ split --numeric-suffixes=1 --number=l/10000 input.txt
split: output file suffixes exhausted
$ ls | tail -n 3
x9997
x9998
x9999
$ ls | head -n 3
input.txt
x0001
x0002
$
$ rm x*
$ split --numeric-suffixes=2 --number=l/9999 input.txt
split: output file suffixes exhausted
$ ls | tail -n 3
x9997
x9998
x9999
$ ls | head -n 3
input.txt
x0002
x0003
As you say, this can always be fixed with the "--suffix-length" argument, but
it’s only required for certain combinations of FROM and CHUNKS (and “split”
already has all the information it needs to work out the width itself).
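A rough sketch of that calculation, assuming the width only has to fit the last suffix, FROM + CHUNKS - 1, and that the default minimum width of 2 still applies. The suffix_len helper is hypothetical and not part of split:

```shell
# Hypothetical helper (not part of coreutils): smallest numeric suffix
# width that fits CHUNKS output files numbered from FROM.
suffix_len() {
    from=$1
    chunks=$2
    last=$((from + chunks - 1))   # suffix of the final output file
    len=${#last}                  # number of decimal digits in it
    if [ "$len" -lt 2 ]; then     # assumed floor: split's default width of 2
        len=2
    fi
    echo "$len"
}

suffix_len 0 10000     # prints 4 (x0000 .. x9999)
suffix_len 1 10000     # prints 5 (last suffix is 10000)
suffix_len 2 9999      # prints 5 (last suffix is 10000)
suffix_len 1234 5678   # prints 4 (last suffix is 6911)
```

The result could be passed by hand, e.g. --suffix-length="$(suffix_len 1 10000)", for the cases above that currently fail.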
> Now you could bump the suffix length based on the start number,
> though I don't think we should as that would impact on future
> processing (ordering) of the resultant files. I.E. specifying
> a FROM value to --numeric-suffixes should only impact the
> start value, rather than the width.
Could you clarify this for me? Doesn’t the zero-padding ensure correct
processing order? I assume the crucial test is the inverse operation:
$ cat x* >& output.txt
$ diff input.txt output.txt
$
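To illustrate what I understand the ordering concern to be, here is a small sketch that does not use split at all. It assumes the files are later processed in lexicographic order (as a glob such as x* would give in the C locale); the file names are made up for the example:

```shell
# Mixed widths: lexicographic order no longer matches numeric order.
mixed=$(printf '%s\n' x9998 x9999 x10000 | LC_ALL=C sort | paste -s -d ' ' -)

# Uniform width: zero-padding keeps the two orders identical.
padded=$(printf '%s\n' x09998 x09999 x10000 | LC_ALL=C sort | paste -s -d ' ' -)

echo "mixed:  $mixed"    # mixed:  x10000 x9998 x9999
echo "padded: $padded"   # padded: x09998 x09999 x10000
```

So as long as every suffix in a run is padded to the same width, the inverse operation should still see the pieces in the right order.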
Thanks, Ben