bug-gnulib

Re: bugs in dirname module


From: Eric Blake
Subject: Re: bugs in dirname module
Date: Thu, 17 Nov 2005 06:28:09 -0700
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Paul Eggert on 11/16/2005 1:50 PM:
> 
> I'm trying to map this to the bigger picture, and coming up empty.
> Among other things, I don't like having "" be a special case.

My only question on this front is whether base_name("") would then return
"." (like POSIX basename), or "".  I would argue that it should return "",
so that we don't create a valid basename out of an invalid file name.

> 
> How about the following idea instead?  Let's omit abase_name and just
> have base_name.  Then we use this rule:
> 
>   If you have a valid file name F, then accessing F is the same as
>   chdir(dir_name(F)) followed by accessing base_name(F).  Furthermore
>   if you successfully { chdir(dir_name(F)); rename(base_name(F),"foo"); },
>   you have renamed F to a file named "foo" in the same directory that
>   F was in.

I like that formulation of the rule, once you overlook the memory leaks
that are there to make the example shorter.
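
To spell the rule out without the leaks, here is a minimal sketch.  It
assumes Unix-style names only (no Cygwin drive letters) and a valid,
non-empty F.  The names sketch_dir_name, sketch_base_name, and
rename_in_place are hypothetical, made up for illustration, and are not
the gnulib code.  The split follows Paul's proposal, where "a/b/" yields
"a/b" and ".".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: malloc a copy of the first LEN bytes of S.  */
char *
dup_prefix (char const *s, size_t len)
{
  char *p = malloc (len + 1);
  if (p)
    {
      memcpy (p, s, len);
      p[len] = '\0';
    }
  return p;
}

/* Sketch: everything before the last slash, minus redundant trailing
   slashes; "." if F has no slash, "/" if only the root is left.  */
char *
sketch_dir_name (char const *f)
{
  assert (f && *f);
  char const *slash = strrchr (f, '/');
  if (!slash)
    return dup_prefix (".", 1);
  size_t len = (size_t) (slash - f);
  while (len > 0 && f[len - 1] == '/')
    len--;
  return len ? dup_prefix (f, len) : dup_prefix ("/", 1);
}

/* Sketch: everything after the last slash, or "." if that is empty.  */
char *
sketch_base_name (char const *f)
{
  assert (f && *f);
  char const *slash = strrchr (f, '/');
  char const *base = slash ? slash + 1 : f;
  return *base ? dup_prefix (base, strlen (base)) : dup_prefix (".", 1);
}

/* The rule itself, with both results freed.  Returns 0 on success and
   -1 on failure.  Note that it changes the working directory.  */
int
rename_in_place (char const *f, char const *newname)
{
  char *dir = sketch_dir_name (f);
  char *base = sketch_base_name (f);
  int result = -1;
  if (dir && base && chdir (dir) == 0)
    result = rename (base, newname);
  free (dir);
  free (base);
  return result;
}
```

A caller would write rename_in_place (F, "foo") and check the result.  A
real implementation would probably want to restore the original working
directory afterwards, which this sketch does not attempt.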

> 
> If we go this route, then base_name(F) cannot in general yield a
> suffix of F even on Unix systems, since we would want dir_name("a/b/")
> == "a/b" and base_name("a/b/") == ".".  Hence base_name will need to
> allocate memory in general, even on Unix.  On Cygwin it will need it
> to compute "./a:b".

I like it - I will go ahead and implement base_name to always malloc()
memory, such that accessing the result always has the desired semantics,
and then propose followup patches to coreutils, tar, and findutils to
accommodate this change in semantics.  Also, I think it may be reasonable
to get rid of base_len - its current semantics are to return the length
of the result of a non-allocating base_name once trailing slashes are
stripped, but with the new base_name semantics, the trailing slashes are
already stripped and strlen() does the same job.
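
As a rough sketch of why base_len becomes redundant, compare an
old-style in-string lookup plus length computation against an allocating
variant.  All three function names below are hypothetical, the code
handles Unix-style names only, and it shows the trailing-slash-stripping
reading (so "a/b/" gives "b", not the "." of Paul's proposal).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Old style: pointer into NAME at the start of its last component.
   The result may still carry trailing slashes, e.g. "b/" for "a/b/".  */
char const *
old_style_base (char const *name)
{
  assert (name);
  char const *base = name;
  for (char const *p = name; *p; p++)
    if (*p == '/' && p[1] && p[1] != '/')
      base = p + 1;
  return base;
}

/* Old style: length of BASE with trailing slashes ignored, keeping
   at least one byte so that "/" stays "/".  */
size_t
old_style_base_len (char const *base)
{
  size_t len = strlen (base);
  while (len > 1 && base[len - 1] == '/')
    len--;
  return len;
}

/* New style: a malloc'd copy with the trailing slashes already gone,
   so plain strlen() on the result gives the same length.  */
char *
alloc_base (char const *name)
{
  char const *base = old_style_base (name);
  size_t len = old_style_base_len (base);
  char *copy = malloc (len + 1);
  if (copy)
    {
      memcpy (copy, base, len);
      copy[len] = '\0';
    }
  return copy;
}
```

For any name, strlen (alloc_base (name)) equals
old_style_base_len (old_style_base (name)), which is the whole argument
for dropping base_len.  The caller owns the result and must free() it.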

Is it worth adding a last_component(name) function that does what the old
base_name did (that is, return a pointer within the passed argument to the
first character of the last relative file name component)?  Actually, if I
did that, then we are back to the question of whether last_component("/")
should be "/" or "", although I would argue that here, "" makes more
sense.  On the other hand, you can already get the same in-string pointer
to the last component by computing name+dir_len(name) and then advancing
past leading slashes.
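
A minimal sketch of that dir_len approach follows.  Both function names
are hypothetical, and sketch_dir_len is my own Unix-only approximation of
what dir_len computes (the length of the directory part, ignoring
trailing slashes but keeping a root slash), not the gnulib source.

```c
#include <assert.h>
#include <string.h>

/* Sketch: length of the directory portion of NAME.  */
size_t
sketch_dir_len (char const *name)
{
  assert (name);
  size_t len = strlen (name);
  /* Ignore trailing slashes, but keep a lone root slash.  */
  while (len > 1 && name[len - 1] == '/')
    len--;
  /* Drop the last component.  */
  while (len > 0 && name[len - 1] != '/')
    len--;
  /* Drop the separating slashes, again keeping a root slash.  */
  while (len > 1 && name[len - 1] == '/')
    len--;
  return len;
}

/* Sketch: pointer into NAME at the first byte of its last component,
   found by skipping the directory part and any slashes after it.  */
char const *
sketch_last_component (char const *name)
{
  char const *p = name + sketch_dir_len (name);
  while (*p == '/')
    p++;
  return p;
}
```

With this sketch, "/" maps to "" (the answer I argued for above), "a/b"
maps to "b", and "a/b/" maps to "b/", since nothing is allocated or
modified and the trailing slash has to stay.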

> 
> Also, src/dirname.c and src/basename.c will have to be modified to
> strip redundant trailing slashes before invoking dir_name and
> base_name.

Actually, I think that base_name itself should strip the trailing
slashes, similar to POSIX basename().  The only reason the old base_name
couldn't do that is that it wasn't allowed to modify the string.
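
For comparison, here is a small sketch of getting that stripping from
POSIX basename() itself.  POSIX allows basename() to modify its argument
and to return a pointer to static storage, so the sketch hands it a
private copy and duplicates the answer.  The wrapper name
posix_base_copy is hypothetical, and this is only an illustration of the
POSIX behavior, not a proposed implementation of base_name.

```c
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: malloc a copy of the string S.  */
char *
copy_string (char const *s)
{
  size_t size = strlen (s) + 1;
  char *p = malloc (size);
  if (p)
    memcpy (p, s, size);
  return p;
}

/* Sketch: run POSIX basename() on a scratch copy of NAME, so NAME is
   never modified, and return the result in freshly malloc'd memory.
   Returns NULL if allocation fails.  */
char *
posix_base_copy (char const *name)
{
  assert (name);
  char *scratch = copy_string (name);
  if (!scratch)
    return NULL;
  /* basename() may write a NUL over trailing slashes in SCRATCH.  */
  char *result = copy_string (basename (scratch));
  free (scratch);
  return result;
}
```

Per POSIX, posix_base_copy ("a/b/") should give "b".  It also gives "."
for "", which is exactly the special case I would rather base_name did
not copy.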

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDfIVo84KuGfSFAYARAulPAJ92jbwzTgwOOQ12n5PvtzQZTm2e1ACgtqoZ
IP+DJjPn9Ce/xnchPFhXmMw=
=xK+J
-----END PGP SIGNATURE-----



