Re: split behavior
From: Pádraig Brady
Subject: Re: split behavior
Date: Mon, 14 Sep 2009 22:31:36 +0100
User-agent: Thunderbird 2.0.0.6 (X11/20071008)
Pádraig Brady wrote:
> Roger McNichols wrote:
>> I found a machine with the old version of split.
>>
>> home:~> uname -a
>> Linux home 2.2.13 #4 Thu May 8 23:11:31 CDT 2003 i686 unknown
>> home:~>
>> home:~> split --version
>> split (GNU textutils) 1.22
>> home:~>
>>
>>
>> Here's the result of
>> home:~> cat /var/log/messages | split -2 - /tmp/x.
>>
>> not exactly as I recalled. instead of adding zz first time, adds za but ends
>> with yz,
>> then starts adding zz... Anyway:
>>
>> x.aa
>> x.ab
>> ...
>> x.yz
>> x.zaaa
>> x.zaab
>> ...
>> x.zyzz
>> x.zzaaaa
>> x.zzaaab
>
> Interesting. I can confirm that textutils-1.22 behaves as above.
> http://ftp.gnu.org/old-gnu/textutils/textutils-1.22.tar.gz
>
> I'll have a look later this evening to see when/why this changed.
The -a option and the fixed-length suffix behaviour were added
in 2002 (2.0.21) to conform to POSIX:
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=65cbf7d1
So you'll be able to get the old behaviour by using split from:
http://alpha.gnu.org/gnu/coreutils/textutils-2.0.20.tar.bz2
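For reference, here is a rough sketch of the old naming scheme, inferred only from Roger's listing above and not from the textutils source. The function name old_style_suffixes is made up for illustration. The assumption is that each time the suffixes run out, one more leading "z" is added and the part after it grows by one letter, with its first letter never reaching "z":

```python
from itertools import count, islice, product
from string import ascii_lowercase


def old_style_suffixes():
    """Yield suffixes in the order seen in the textutils-1.22 listing.

    Hypothetical reconstruction: level 0 is aa..yz, level 1 is
    zaaa..zyzz, level 2 is zzaaaa..zzyzzz, and so on.
    """
    for level in count(0):
        prefix = "z" * level
        width = level + 2
        # The first letter of the body stops at 'y'; a leading 'z' is
        # reserved to signal that a longer suffix follows.
        for first in ascii_lowercase[:-1]:
            for rest in product(ascii_lowercase, repeat=width - 1):
                yield prefix + first + "".join(rest)


names = list(islice(old_style_suffixes(), 652))
print(names[0], names[649], names[650], names[651])
# prints: aa yz zaaa zaab
```

A side effect of this layout, if the reconstruction is right, is that the generated names stay in sorted order, so "cat x.*" would still reassemble the pieces correctly.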
POSIX seems to only consider fixed length suffixes, saying:

  split [-l line_count] [-a suffix_length] [file[name]]
  split -b n[k|m] [-a suffix_length] [file[name]]

  The suffix shall consist of exactly suffix_length lowercase letters

  By default, the names of the output files shall be 'x', followed
  by a two-character suffix from the character set as described
  above, starting with "aa", "ab", "ac", and so on, and continuing
  until the suffix "zz", for a maximum of 676 files. If the number
  of files required exceeds the maximum allowed by the suffix
  length provided, the split utility shall fail

  The -a option was added to overcome the limitation of being able
  to create only 676 files.

The last statement is ironic in this context. I would think that
the old behaviour is still desirable if -a was not specified and
POSIXLY_CORRECT was not set?
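To make the contrast concrete, the fixed-length scheme from the POSIX text can be sketched as below. This is only an illustration of the quoted wording, not the coreutils code; posix_suffixes and posix_names are hypothetical names, and the sketch checks the limit up front, whereas a real split reading from a pipe could only fail once it actually runs out of names.

```python
from itertools import islice, product
from string import ascii_lowercase


def posix_suffixes(suffix_length=2):
    """Yield every suffix of exactly suffix_length lowercase letters."""
    for letters in product(ascii_lowercase, repeat=suffix_length):
        yield "".join(letters)


def posix_names(files_needed, prefix="x", suffix_length=2):
    """Return output names, failing if the suffix space is too small."""
    limit = len(ascii_lowercase) ** suffix_length  # 26**2 == 676 by default
    if files_needed > limit:
        raise ValueError(
            f"need {files_needed} files but only {limit} suffixes exist"
        )
    return [prefix + s for s in islice(posix_suffixes(suffix_length), files_needed)]


print(posix_names(3))                    # ['xaa', 'xab', 'xac']
print(len(posix_names(677, suffix_length=3)))  # 677, fits once -a 3 is used
```

With the default length of 2, asking for a 677th file raises, which is the failure the POSIX text requires and the case the old variable-length naming avoided.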
cheers,
Pádraig.