bug-findutils


From: James Youngman
Subject: Re: RFE: allowing "" as a path specification for 'current dir' w/o prepending './' ?
Date: Sat, 25 Feb 2017 14:30:26 +0000

On Mon, Feb 20, 2017 at 7:52 PM, L A Walsh <address@hidden> wrote:

>
> Would it be possible or not unduly difficult to change
> 'find' to recognize/allow a null path ("") specifically
> to allow find to start at the current directory (much like
> not specifying any paths), BUT also suppress the prepending
> of "./" at the beginning of every result?
>

The current behaviour is deliberate.  It's intended, among other things, to
protect the unwary or inexperienced user against unexpected file names such
as "-f".

I think this is a reasonable precaution even if find supported usage such
as find "".  The file names which cause problems are not just dangerous,
they are also rare[1], so casual testing won't detect the problem which
lies in wait.
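
A minimal sketch of the hazard, assuming a throwaway scratch directory (the
variable names are illustrative only).  It shows that find reports a file
named "-f" as "./-f", and that stripping the prefix leaves a string which a
later command would be likely to parse as an option:

```shell
# Scratch directory for the demonstration (hypothetical setup).
demo=$(mktemp -d)

# Create a file literally named "-f"; "--" ends option parsing for touch.
touch -- "$demo/-f"

# Started from ".", find prefixes every result with "./".
found=$(cd "$demo" && find . -type f)
printf '%s\n' "$found"

# Stripping the prefix leaves a bare "-f", which looks like a flag.
stripped=${found#./}
printf '%s\n' "$stripped"

rm -rf "$demo"
```

Passing "$found" to a tool such as rm is safe as it stands; passing
"$stripped" is safe only if the caller remembers to put "--" in front of it.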


> Almost every usage of "./" in a find is, at some point,
> followed by a stripping of "./" -- somewhere, in subsequent
> use.
>

If the output is intended to be consumed by other programs then stripping
the ./ is not needed (and as I imply above, sometimes dangerous).  If the
output is intended to be consumed by humans, -printf is likely to give the
control and results needed, as Bernhard already suggested.
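
As an illustration of the -printf route (the exact format Bernhard proposed
is not quoted here, so the choice of %P is an assumption), GNU find's %P
directive prints each name with the starting point removed.  A sketch using
a hypothetical scratch directory:

```shell
# Scratch tree containing one file in a subdirectory (hypothetical setup).
demo_p=$(mktemp -d)
mkdir "$demo_p/sub"
touch "$demo_p/sub/a.txt"

# %P is the path relative to the starting point, so no "./" appears.
# -printf is a GNU find extension and is not specified by POSIX.
rel=$(cd "$demo_p" && find . -type f -printf '%P\n')
printf '%s\n' "$rel"

rm -rf "$demo_p"
```

This is suitable for output read by humans; names printed this way can still
begin with "-" or contain newlines.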


> This would allow a backwards compatible way of forcing
> a null-path onto the front of returned values.
>

If users want to omit the "./" then I suggest post-processing the output of
"find -print0" with sed -z 's/^..//'.   If users are using just "find" or
"find -print" then, well, the result is in the general case basically
unfixable since filenames can contain newlines.
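
A sketch of that post-processing, assuming GNU sed (the -z option is a GNU
extension) and a scratch directory holding one file whose name contains a
newline.  The variable names are illustrative:

```shell
# Scratch directory with a file named "hello<newline>world" (hypothetical setup).
demo_z=$(mktemp -d)
name=$(printf 'hello\nworld')
touch "$demo_z/$name"

# NUL-delimited records: sed -z strips the first two characters ("./")
# of each record, and the embedded newline stays inside the name.
stripped_z=$(cd "$demo_z" && find . -type f -print0 | sed -z 's/^..//' | tr -d '\0')

# Count the records seen by each approach.
records=$(cd "$demo_z" && find . -type f -print0 | tr -cd '\0' | wc -c)
lines=$(cd "$demo_z" && find . -type f -print | wc -l)
printf 'records=%s lines=%s\n' "$records" "$lines"

rm -rf "$demo_z"
```

The NUL-delimited pipeline sees one file, while the line-oriented one sees
two lines for the same file, which is why plain -print output cannot be
repaired after the fact.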

For what it's worth, the existing behaviour has been this way for as long
as the source repository records (the first commit I have in the git
repository is from 1996-02-04).   There are also some conceptual
difficulties, such as what "find -maxdepth 0" would print, or what "find
-printf %H\n" would produce.  However, to be frank, those difficulties
aren't the real reason I'm reluctant to support such a behaviour; the real
reason is the one in my first paragraph above.

--- end ---

[1] The fact that Unix tools and interfaces allow file names such as
"hello\nworld" is a general problem with the "everything is a text
file" metaphor.   The metaphor allows great synergies between tools, but I
believe that newline at least should have been forbidden in file names.

The fact that file names such as "-f" are also possible causes problems of
a different kind.  I suppose this could be partly addressed by narrowing
the allowable set of file names (e.g. forbidding "-" as the leading
character).  Then you're just left with potential problems such as "*" or
";" in file names, but it's significantly harder to shoot oneself in the
foot with those.

As things stand today, the "file names are text" idea is not really true.
(If it were true you would be able to process the output of "find -print"
with "sed" without experiencing problems.)  The POSIX committee acted for a
long time as if they believed it were true (I believe it's one of the
reasons they resisted standardising "-print0").   But newlines inside file
names have been supported for a long time (perhaps always, I'm not sure) in
Unix file names, and they have always caused problems when processing file
names as text.  The problem became still worse with the introduction of
support for non-C locales, since at that point it became possible for a
directory entry and its parent to use names in incompatible encoding
systems, with the effect that the full path name would be process-able only
in the C locale and could not be displayed correctly to the user.
Supporting Unicode from the beginning would have been a good alternative
solution to this, but Unicode didn't exist back then.  In any case,
supporting Unicode from the beginning also has its problems, as Microsoft
discovered when Unicode code points outside the basic multilingual plane
were introduced.

James.

