Re: [bug-gawk] Percent Signs in External Commands on Windows


From: David Millis
Subject: Re: [bug-gawk] Percent Signs in External Commands on Windows
Date: Fri, 13 Apr 2012 04:54:18 -0700 (PDT)

Much of this will be remedial for you. But racking my brain, the only way I can 
make sense of your responses is that you're conflating some of the Windows 
innards described below, to the point of disregarding what I write. Or I'm 
missing something subtle. Either way this should be informative for one or both 
of us...

I'm starting to get a headache second-guessing whether any of today's messages 
will be understood. Maybe prose instead of explicit code will be clearer if the 
others failed somehow.


Based on the thread you linked, the following equals-delimited section should 
definitely be familiar.

= = = =
--- On Thu, 4/12/12, Eli Zaretskii <address@hidden> wrote:

> think about protecting a wildcard
I should mention also that this is mistaken. CMD doesn't itself expand 
asterisks, the way shells like bash would.

c:\> python -c "import sys; print sys.argv[1]" *
*

The underlying CreateProcess() C function takes two relevant args: a string 
with the path to an exe, and a string with _all_ the args lumped together. It's 
the task of _each_ exe to break its own args up into argv[] chunks. Many exe's 
were compiled to use the MSVCRT startup parser, the shell32 CommandLineToArgvW() 
func, or a similar API. Some were compiled to use setargv.obj to expand 
asterisks. They're each free to split/unescape their own args as inconsistently 
as they like.

An exe can even have a custom func that uses GetCommandLine() to disregard 
compile-time decisions altogether and take environment vars or time of day into 
account when making its own argv[].

When any exe uses CommandLineToArgvW, IT is doing one flavor of backslashed 
quote mangling. That leads to the headaches you're thinking of (especially, as 
you discovered, when an MSYS bash shell is unhelpfully pre-parsing everything).
http://blogs.msdn.com/b/oldnewthing/archive/2010/09/17/10063629.aspx
http://msdn.microsoft.com/en-us/library/a1y7w461.aspx
= = = =
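
For reference, the backslash-and-quote rules from the second MSDN link above 
can be sketched in Python. This is only an illustration of the documented rules 
for the args after the program name. The function name split_msvcrt_style is 
made up here, and the sketch leaves out wildcard expansion and the doubled-quote 
special case.

```python
def split_msvcrt_style(cmdline):
    """Split an all-args string per the documented MS C runtime rules.

    Illustrative only: a real exe may split its args any way it likes.
    """
    args = []
    current = []
    in_quotes = False
    have_arg = False  # lets an empty quoted arg ("") still count as an arg
    i = 0
    n = len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Measure the whole run of backslashes.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # 2n backslashes + quote   -> n backslashes, quote toggles.
                # 2n+1 backslashes + quote -> n backslashes, literal quote.
                current.append("\\" * (count // 2))
                if count % 2 == 1:
                    current.append('"')
                else:
                    in_quotes = not in_quotes
                j += 1  # consume the quote
            else:
                # Backslashes not followed by a quote are literal.
                current.append("\\" * count)
            have_arg = True
            i = j
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args
```

Under these rules a bare asterisk comes back as a single unexpanded arg, which 
is consistent with the python transcript above.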

So far so good? Now what I was talking about, with that in mind...


CMD.exe doesn't look very closely at what it's given by CreateProcess(). It 
just crawls its all-args string until it finds /C and chops the string there. 
Then it may or may not chop the first and last quotes. What's left is then 
handled, apparently, by the same subroutine that parses live prompt input.

c:\> cmd /C /Q ECHO hi
'/Q' is not recognized
c:\> cmd /Q /CECHO hi
hi
c:\> cmd /Q /C"ECHO hi"...
hi...
c:\> cmd /Q /C"ECHO hi"..."
hi"...
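
The chopping seen in that transcript can be modeled with a few lines of Python. 
This is a rough sketch, not CMD's actual code: the name command_after_slash_c 
is hypothetical, the /C search is a naive substring match, and it only covers 
the "strip the leading quote and the last quote" behavior, ignoring /S and the 
case where CMD decides to keep the quotes.

```python
def command_after_slash_c(all_args):
    """Return the command text CMD would go on to parse, or None without /C.

    Assumption: /C is found by a plain case-insensitive search, and leading
    whitespace after it is skipped.
    """
    pos = all_args.upper().find("/C")
    if pos == -1:
        return None
    rest = all_args[pos + 2:].lstrip()
    if not rest.startswith('"'):
        return rest
    # Drop the leading quote, then remove the LAST quote on the line,
    # keeping whatever text follows it.
    body = rest[1:]
    last = body.rfind('"')
    if last == -1:
        return body  # assumption: a lone leading quote is simply dropped
    return body[:last] + body[last + 1:]
```

Feeding it the four all-args strings from the transcript gives '/Q ECHO hi', 
'ECHO hi', 'ECHO hi...' and 'ECHO hi"...' respectively, which lines up with 
what CMD printed.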


It's stupid, but it's a consistent stationary target. It may be impossible to 
anticipate how any unspecified executable splits args. But it's not impossible 
to anticipate how CMD will read its args when executed by a low-level library, 
independent of shells.


For any live input, or line in a batch file, CMD's role is to substitute env 
vars, and tokenize on pipelines and redirects (where not escaped with a caret) 
into (exe + all-args) pairs, to finally hand each pair to its own 
CreateProcess() call. Then it connects the std streams together.

The second answer (by jeb) at this link goes into detail on the careted 
pipeline logic.
http://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts
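
As a much-simplified illustration of the caret logic, here is a Python sketch 
that splits a line into pipeline stages. The name split_pipeline is made up, 
and the sketch assumes only three things: a caret outside quotes escapes the 
next character, quotes protect a pipe, and an unescaped pipe separates stages. 
It does not attempt env var substitution, redirects, &, ||, or the multi-pass 
parsing described in the linked answer.

```python
def split_pipeline(line):
    """Split a command line on unescaped, unquoted '|' characters."""
    stages = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            # Quotes are kept in the text; they only toggle protection.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "^" and not in_quotes and i + 1 < len(line):
            # The caret is dropped and the next character is taken literally.
            i += 1
            current.append(line[i])
        elif ch == "|" and not in_quotes:
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    stages.append("".join(current).strip())
    return stages
```

For example, 'echo a ^| b | sort' yields the two stages 'echo a | b' and 
'sort', since the careted pipe is taken as literal text.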

Non-gawk languages with a system(WHATEVER) call, including C, translate that 
into something similar to
CreateProcess(COMSPEC, "/C WHATEVER", other, proc, attributes);

Which, taken together, is why user-added fodder quotes work reliably to 
reproduce prompt-like behavior in a non-shell environment while invoking the 
CMD executable specifically.
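
To make the fodder-quote point concrete, here is a small Python sketch. Both 
function names are hypothetical, and the stripping step models only CMD's 
"drop the leading quote and the last quote" behavior, under the assumption that 
CMD does not take its keep-the-quotes path.

```python
def wrap_for_cmd(whatever):
    """Build the all-args string with one extra pair of fodder quotes."""
    return '/C "' + whatever + '"'


def cmd_sees(all_args):
    """Model what is left after CMD chops at /C and strips outer quotes."""
    rest = all_args[all_args.upper().find("/C") + 2:].lstrip()
    if not rest.startswith('"'):
        return rest
    body = rest[1:]
    last = body.rfind('"')
    return body if last == -1 else body[:last] + body[last + 1:]


# A command whose own quotes must survive, as typed at a prompt.
typed = '"c:\\my prog.exe" "arg 1"'
survives = cmd_sees(wrap_for_cmd(typed)) == typed
# Without the fodder pair, the command's own outer quotes are eaten.
mangled = cmd_sees("/C " + typed)
```

With the fodder pair the text reaches the parser exactly as typed. Without it, 
the same stripping removes the command's own first and last quotes instead.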

Or am I missing something?


David Millis



