Incorrect parsing of DOS/Windows paths ??
From: Paul Smith
Subject: Incorrect parsing of DOS/Windows paths ??
Date: Sun, 18 Dec 2016 11:44:52 -0500

Hi all (especially Eli! :)).
A bug (https://savannah.gnu.org/bugs/?49115) came in about the way we
parse filenames in read.c:parse_file_seq(). There is a loop that's
supposed to chop a string into individual filenames, and each time
through the loop we search for the end of the next name like this:
    /* There are names left, so find the end of the next name.
       Throughout this iteration S points to the start.  */
    s = p;
    p = find_char_unquote (p, stopmap|MAP_VMSCOMMA|MAP_BLANK);
Then if we're parsing DOS paths we check to see if the stopmap contains
a colon and if so, we have to determine if we stopped because of a drive
specifier; the idea, I think, is to support things like this correctly:
    C:/foo/bar.o:C:/biz/bar.c
should parse as two paths:
    C:/foo/bar.o
    C:/biz/bar.c
The code is:
#ifdef HAVE_DOS_PATHS
    /* For DOS paths, skip a "C:\..." or a "C:/..." until we find the
       first colon which isn't followed by a slash or a backslash.
       Note that tokens separated by spaces should be treated as separate
       tokens since make doesn't allow path names with spaces */
    if (stopmap | MAP_COLON)
      while (p != 0 && !ISSPACE (*p) &&
             (p[1] == '\\' || p[1] == '/') && isalpha ((unsigned char)p[-1]))
        p = find_char_unquote (p + 1, stopmap|MAP_VMSCOMMA|MAP_BLANK);
#endif
As the bug report points out, the if condition is clearly broken: it
will always be true. However, the body of the if-statement looks weird
to me as well; I've checked, and it's been like this almost forever
though. We're trying to
find the end of the current path. Why do we keep iterating as long as
there's a colon followed by a slash or backslash?
E.g., from what I can see this will accept the following as a valid,
single pathname:
foo:/bar:\biz
???
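
To make that concrete, here is a minimal sketch of the loop's behaviour
outside of make. It is not the real code: next_stop() is a hypothetical
stand-in for find_char_unquote() that only knows about ':' and blanks
(no quoting, no stop map), old_name_len() is an invented name, and the
"p > s" guard is added so the sketch never reads before the start of
the string.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for find_char_unquote(): return a pointer to the
   next ':' or blank at or after P, or NULL if there is none.  The real
   function also handles quoting and a configurable stop map.  */
static const char *
next_stop (const char *p)
{
  p += strcspn (p, ": \t");
  return *p ? p : NULL;
}

/* Length of the first name in S, using the same skipping rule as the
   existing HAVE_DOS_PATHS loop: keep going past any colon that follows
   a letter and is followed by a slash or backslash.  */
static size_t
old_name_len (const char *s)
{
  const char *p = next_stop (s);

  while (p != NULL && !isspace ((unsigned char) *p)
         && (p[1] == '\\' || p[1] == '/')
         && p > s && isalpha ((unsigned char) p[-1]))
    p = next_stop (p + 1);

  return p ? (size_t) (p - s) : strlen (s);
}
```

With these stand-ins, old_name_len ("foo:/bar:\\biz") returns 13, the
length of the whole string, while old_name_len ("C:foo.o:C:foo.c")
returns 1, so a drive letter without a following slash is split off on
its own.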
Did I misread this code, or is there some reason to accept ":/" and ":\"
in the middle of a path in Windows/DOS that I'm not aware of (I'm not a
guru with Windows filesystems)?
Why wouldn't the correct algorithm be: if we stopped due to a drive
specifier (the pathname starts with "[A-Za-z]:") then look once more
until the next stopchar and then we're done? E.g., I would think it
should look something like:
#ifdef HAVE_DOS_PATHS
    /* If we stopped due to a drive specifier, skip it.
       Tokens separated by spaces are treated as separate paths since make
       doesn't allow path names with spaces */
    if (p && p == s+1 && p[0] == ':' && isalpha ((unsigned char)s[0]))
      p = find_char_unquote (p+1, stopmap|MAP_VMSCOMMA|MAP_BLANK);
#endif
Note that this doesn't require the drive specifier to be followed by a
slash/backslash. E.g., this:
    C:foo.o:C:foo.c
breaks down as:
    C:foo.o
    C:foo.c
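
The proposed rule can be tried outside of make with a rough sketch like
the following. It makes the same kind of simplifying assumptions as any
toy version would: next_stop() is a hypothetical stand-in for
find_char_unquote() that only stops at ':' and blanks, and
new_name_len() is an invented name, not anything in read.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for find_char_unquote(): return a pointer to the
   next ':' or blank at or after P, or NULL if there is none.  */
static const char *
next_stop (const char *p)
{
  p += strcspn (p, ": \t");
  return *p ? p : NULL;
}

/* Length of the first name in S under the proposed rule: if the first
   stop is the colon of a leading "[A-Za-z]:" drive specifier, search
   exactly once more for the real end of the name.  */
static size_t
new_name_len (const char *s)
{
  const char *p = next_stop (s);

  if (p && p == s + 1 && p[0] == ':' && isalpha ((unsigned char) s[0]))
    p = next_stop (p + 1);

  return p ? (size_t) (p - s) : strlen (s);
}
```

Under these assumptions new_name_len ("C:foo.o:C:foo.c") returns 7
("C:foo.o"), and calling it again on the remainder "C:foo.c" returns 7
as well. For "foo:/bar:\\biz" it returns 3, so the name ends at the
first colon.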