bug-gnulib

Re: uuencode: multi-bytes char in remote file name contains bytes >0x80


From: Eric Blake
Subject: Re: uuencode: multi-bytes char in remote file name contains bytes >0x80
Date: Tue, 05 Jul 2011 09:58:26 -0600

On 07/05/2011 09:45 AM, John Cowan wrote:
> Eric Blake scripsit:
> 
>> [B]ut if filename is _not_ a character string in the current locale, then
>> the output would _not_ be a text file (among other things, a text file
>> has the property that at least one locale can interpret every byte
>> sequence in the file as valid characters).  
> 
> Say what?  The name of a file is not a byte sequence in the file.  I
> don't see how it follows that because a file is a text file, its name
> is a character string in some locale.

When used according to POSIX, the 'decode_pathname' argument (POSIX
notation; REMOTEFILE in 'uuencode --help' notation) is copied literally
into the output of 'uuencode', on the line starting with "begin"; POSIX
also requires that output to be a text file.  It also helps to read
another of the POSIX requirements on uuencode: "If there are characters
in decode_pathname that are not in the portable filename character set
the results are unspecified."  Therefore, you _cannot_ use uuencode to
pass the name of a file that contains non-portable characters and still
have output that complies with POSIX.
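
To make the rule concrete, here is a rough sketch of the check it implies,
assuming the portable filename character set is the ASCII letters, the
digits, and '.', '_', '-'.  The function name is made up for
illustration and is not part of any uuencode implementation; rejecting
'/' and the empty name is a simplification of this sketch.

```python
import string

# POSIX portable filename character set: A-Z a-z 0-9 . _ -
# Stored as byte values, because a file name is a byte sequence that
# need not decode to characters in the current locale.
PORTABLE_BYTES = frozenset(
    (string.ascii_letters + string.digits + "._-").encode("ascii")
)


def is_portable_name(name: bytes) -> bool:
    """Hypothetical helper: True if every byte is in the portable set.

    Assumption of this sketch: an empty name and any '/' are rejected.
    """
    return len(name) > 0 and all(b in PORTABLE_BYTES for b in name)


if __name__ == "__main__":
    for candidate in (b"report-2011.txt", "r\u00e9sum\u00e9.txt".encode("utf-8")):
        print(candidate, is_portable_name(candidate))
```

Any name for which this returns False falls into the "results are
unspecified" case quoted above.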

Which means that for our particular implementation of uuencode, if we
encounter a file name that contains any bytes outside the portable
filename character set, then we can do whatever we want, all as an
extension to POSIX: error out; output some sort of prefix line that
tells knowledgeable uudecode implementations that an encoded form of
the file name follows; output a binary file rather than a text file (by
outputting the file name as a literal sequence of bytes, even though
those bytes are not characters in the current locale); or anything
else.  Of course, our goal should be for the out-of-the-box behavior to
serve the most likely use (that is, it would be better if we could just
make uuencode work on all possible file names, even on the ones where
POSIX does not require any particular behavior).
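
As an illustration only, two of those choices (error out, or emit the
raw bytes) could look roughly like this.  The function name and the
'policy' parameter are invented for the sketch, not taken from any real
uuencode; the prefix-line idea is left out because no such format has
been defined in this thread.

```python
import string

# Portable filename character set as byte values: A-Z a-z 0-9 . _ -
_PORTABLE = frozenset(
    (string.ascii_letters + string.digits + "._-").encode("ascii")
)


def begin_line(mode: int, name: bytes, policy: str = "error") -> bytes:
    """Hypothetical sketch: build the "begin" line of uuencode output.

    policy "error": refuse names with bytes outside the portable set.
    policy "raw":   copy the bytes literally; the result may then not be
                    a text file in the current locale.
    """
    if policy not in ("error", "raw"):
        raise ValueError("unknown policy: " + policy)
    if policy == "error" and not all(b in _PORTABLE for b in name):
        raise ValueError("decode_pathname contains non-portable bytes")
    # The mode is written in octal, followed by the name and a newline.
    return b"begin " + format(mode, "o").encode("ascii") + b" " + name + b"\n"


if __name__ == "__main__":
    print(begin_line(0o644, b"file.txt"))
    print(begin_line(0o644, b"caf\xc3\xa9.txt", policy="raw"))
```

With "raw", the name survives byte for byte, which is the "work on all
possible file names" behavior, at the cost of output that POSIX no
longer guarantees to be a text file.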

-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
