bug-coreutils

bug#22001: Is it possible to tab separate concatenated files?


From: Eric Blake
Subject: bug#22001: Is it possible to tab separate concatenated files?
Date: Thu, 26 Nov 2015 20:28:13 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 11/26/2015 04:52 PM, Linda Walsh wrote:

>> Because every plain
>> text line in a file must be terminated with a newline.
> ----
>    That's only a recent POSIX definition.  It's not related to
> real life.  When I looked for a text file definition on google, nothing
> was mentioned about needing a newline on the last line -- except on
> 1 site -- and that site was clearly not talking about 'text' files, but
> Unix-text-record files w/each record terminated by a NL char.
> 

Quit spreading FUD about POSIX.  That definition of text file is NOT a
recent invention; even back in POSIX 2001 the definition read:

3.392 Text File

A file that contains characters organized into one or more lines. The
lines do not contain NUL characters and none can exceed {LINE_MAX} bytes
in length, including the <newline>. Although IEEE Std 1003.1-2001 does
not distinguish between text files and binary files (see the ISO C
standard), many utilities only produce predictable or meaningful output
when operating on text files. The standard utilities that have such
restrictions always specify "text files" in their STDIN or INPUT FILES
sections.
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html

That was POSIX Issue 6; the more recent POSIX Issue 7 corrected the
definition to also allow a completely empty file to be considered as a
text file.  But the point is that POSIX has always required a text file
to end in a newline.
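
To illustrate the definition quoted above, here is a minimal sketch of a
check against it. The trailing-newline rule follows from the POSIX
definition of a line (zero or more non-newline characters plus a
terminating newline). The helper name `looks_like_text_file` and the
constant `ASSUMED_LINE_MAX` are hypothetical, and 2048 is only the
smallest LINE_MAX that POSIX permits; a real system may report a larger
limit.

```python
# Assumption: 2048 is the minimum LINE_MAX POSIX allows ({_POSIX2_LINE_MAX});
# the actual limit on a given system may be larger.
ASSUMED_LINE_MAX = 2048


def looks_like_text_file(data: bytes, line_max: int = ASSUMED_LINE_MAX) -> bool:
    """Hypothetical helper: test raw bytes against the quoted definition."""
    if not data:
        # Issue 7 also accepts a completely empty file (zero lines).
        return True
    if b"\0" in data:
        # Lines must not contain NUL characters.
        return False
    if not data.endswith(b"\n"):
        # The final line lacks its terminating newline.
        return False
    # Drop the final newline, then split; each line's length is counted
    # including its own newline, as the definition requires.
    lines = data[:-1].split(b"\n")
    return all(len(line) + 1 <= line_max for line in lines)
```

Under these assumptions, `looks_like_text_file(b"a\nb\n")` is true, while
`looks_like_text_file(b"a\nb")` is false because the last line is not
terminated.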

>    On a mac, txt files have records separated by 'CR', and on DOS/Win,
> txt files have txt records separated by CRLF.

And those systems aren't POSIX.  So they aren't relevant to a discussion
about POSIX.


>> Why isn't there a newline at the end of the file?  Fix that and all of
>> your problems and many others go away.
>>   
> ---
>    Didn't used to be a requirement -- it was added because of a broken
> interpretation of the posix standard.  Please remember that a posixified
> definition of 'X' (for any X), may not be the same as a real-life 'X'.

No, it has ALWAYS been a problem.  Even 40 years ago, before POSIX was
invented, the only PORTABLE way to use programs like sed was to run them
on text files - namely, files where no line exceeded LINE_MAX bytes,
where no lines contained NUL bytes, and where ALL lines ended in
newline.  That is because there were vendor implementations of sed (not
GNU coreutils, mind you, but other vendors) that really were hardcoded
to some rather small limits, and understandably so in a day when
computers did not have as much memory as they do today.  POSIX just
standardized existing practice on what formed a text file on the Unix
systems that existed at that time.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
