bug-bash

Re: nofork command substitution


From: Chet Ramey
Subject: Re: nofork command substitution
Date: Mon, 22 May 2023 11:18:35 -0400

On 5/19/23 2:42 PM, Robert Elz wrote:
     Date:        Fri, 19 May 2023 12:03:51 -0400
     From:        Chet Ramey <chet.ramey@case.edu>
     Message-ID:  <0a85095a-1665-d936-b4fa-118dd158e5a8@case.edu>


   | Maybe, and certainly possible, but a more likely use is just a simple
   | assignment to REPLY.

In such cases, the value to be assigned needs to come from somewhere,
and I can't really see a lot of benefit in using this indirect method
of getting that value, rather than simply putting the somewhere directly
in the word, instead of this command substitution, except in the most
unlikely of cases (like when REPLY is built up gradually, via a loop
gradually adding more text).

I think either something like that or a conditional that sets REPLY to one
of several possible values are likely use cases. Being able to run an
arbitrary set of commands without forking has its advantages.

There's also the ability to do this without having to create and tear down
an anonymous file of some type for output. Since the whole point is to
avoid forking, you can't readily use a pipe.


   | Probably. The bash implementation is the union of mksh and ksh93's (with
   | one exception, see below), and that part comes from mksh.

Ah, that info wasn't apparent earlier - I wasn't aware this was just
a copy of earlier implementations, it seemed more like something new.

This discussion has been going on in this form (for this type of command
substitution) since at least 2020, and the general topic even longer. For
example:

https://lists.gnu.org/archive/html/help-bash/2020-05/msg00038.html

is the thread that inspired me to look at this stuff in the first place,
and test which characters are valid following the open brace.

   | The other benefit is not to have to output any text if you don't have to,
   | and for the calling shell not to have to read it.

That's all pseudo-noise --- the shell would need to set REPLY, and then
expand its value.  The alternative is simply to "write" the output into
a buffer in memory which the shell then "reads" when it wants the results
of the command substitution.  Those two are really no different, except
that the REPLY form does all of the work needed to assign to a var, and
then access it (one way or another) - plus everything necessary to make
REPLY a local variable.

Nope. If you're going to run arbitrary commands and capture their output,
you need a legitimate file descriptor for them to use. You can have the
user do something like use a tmpfs for $TMPDIR, or look at various
implementations of memfd_create/shm_open/shm_mkstemp, but you have to have
a file descriptor.
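The file descriptor requirement can be imitated at script level. This is only a rough sketch of the idea, not how bash implements it internally: the command list runs in the current shell with its standard output pointed at a temporary file, and the text is read back afterwards. The names `counter`, `tmpfile`, and `output` are hypothetical.

```bash
# Hypothetical sketch: capture a command list's output with no subshell for the list itself.
counter=0
tmpfile=$(mktemp "${TMPDIR:-/tmp}/capture.XXXXXX") || exit 1   # setup only; this line does fork

# Run the commands in the current shell with stdout pointed at a real file.
{ counter=$((counter + 1)); printf 'call %d\n' "$counter"; } > "$tmpfile"

# Read the text back; read returns non-zero at end of file, which is expected here.
IFS= read -r -d '' output < "$tmpfile" || true
rm -f -- "$tmpfile"

# Strip trailing newlines, as command substitution does.
while [[ $output == *$'\n' ]]; do output=${output%$'\n'}; done

printf '%s (counter=%s)\n' "$output" "$counter"
```

Because the group command is not run in a subshell, the change to `counter` is still visible after the output has been captured.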

Of course all of this only works simply if the commands in the cmdsub
generating the text are built in to the shell, if something like sed
is being used (or awk) then you have to read its output anyway.

Since you need it for the general case, you need it in every case. Bash
uses stdio for its output, so special-casing builtins to write to a
memory buffer is more work than it is worth.


   | I would have thought it obvious the space, tab, and newline variants do
   | the execute-command-in-the-current-environment-and-capture-output thing,
   | as described first, where the exceptions are explicitly detailed and
   | everything else is invalid. I can make the former even more explicit.

The question is more (aside from "compat with mksh" which I didn't previously
know was an aim - not sure 100% compat needs to be anyway) why 3 different
but equivalent forms are useful.   The space variant is obvious, that's how
we get this to be a cmdsub rather than a var expansion.   I can see the use
of the newline form, so one can write

        cmd arg arg word${
                sed -n /^p/p /usr/dict/words
                        }more

or something, without needing to put a hard to see space after the '{'.

I can't however see any real rational purpose for the tab variant.

Probably because Korn treated them all the same as shell metacharacters --
that is, they are in the same `class'.


   | I decided not to seek deliberate incompatibility with mksh.

Sure, that makes sense, but you don't need to necessarily implement
everything they implemented, when there is no need for it.

   | Remember where I said this was the union of the mksh and ksh93 features?
   | This one is from ksh93. It's not really any different from $(...). It
   | doesn't cost anything additional to support, either.

It might not cost anything to implement, but it needs to be explained,
documented, and then users forever need to try and understand why there
are two (seemingly equivalent) ways to achieve the same thing, and try
to work out which of them is the right one to use.

They can ignore it, if they like. You can get what you want thinking that
space and `|' are the only valid characters and everything else should
never be used.

   | You can have that, if you want. x=${ func; } (or x=${ func 1 2 3; }) does
   | the right thing. The ${...;} inherits the positional parameters, so things
   | like `shift' work as expected in the body,

I'm not sure I'd call what happens there "as expected" - I'd just call
it weird.

That is, of course, your right.
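A short sketch of the `x=${ func; }` usage quoted above, assuming bash 5.3 or later. The function `greet` and the variable `calls` are hypothetical. The second half relies on the positional parameters being shared with the caller, so the `shift` should still be visible after the substitution.

```bash
# Assumes bash 5.3 or later; on older shells, stop before the new syntax is reached.
(( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 3) )) || exit 0

# Hypothetical function: writes a greeting and counts how often it was called.
greet() { calls=$((calls + 1)); printf 'hello %s' "$1"; }

calls=0
msg=${ greet world; }      # runs in the current shell, so calls is updated here
printf '%s (calls=%s)\n' "$msg" "$calls"

# The body sees the caller's positional parameters and can shift them.
set -- one two three
first=${ printf '%s' "$1"; shift; }
printf 'first=%s, remaining count=%s, new first=%s\n' "$first" "$#" "$1"
```

With `$(greet world)` the increment of `calls` would be lost in the subshell; here it persists.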


There's also the issue of in exactly what context these expansions happen.

This is not an issue specific to command substitution. The context is the
same as other word expansions. This is the same question as whether or not
variable assignments in word expansions in redirections change values in
the calling shell.
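The redirection case can be tried with an ordinary `${var:=word}` expansion, which needs no new syntax. This is a small illustration with hypothetical variable names; the external-command result is deliberately left unasserted because a shell may expand that redirection after forking.

```bash
unset -v logfile other

# ':' is a builtin, so the redirection word is expanded in the current shell
# and the assignment made by ${logfile:=...} survives the command.
: > "${logfile:=/dev/null}"
printf 'logfile=%s\n' "${logfile-unset}"

# With an external command the expansion may happen in the child process,
# in which case the assignment is lost. Shells differ here.
env true > "${other:=/dev/null}"
printf 'other=%s\n' "${other-unset}"
```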


[Aside: when I first tried this, I forgot the need for the ';' before the '}'
  but, surprisingly, it still worked exactly the same way, mksh it seems
simply finds a terminating '}' and uses it - reserved word type matching not
required.]

I'd call that a bug. It's not how mksh documents this type of command
substitution to work. ksh93 documents the parsing the same way.


   | It's to allow local variables and `return'. The existing implementations
   | all agree on that.

Sure, the question isn't that, it is why it doesn't also make local
positional params, just the same as any other function.   (But return
and local can both be made to work in any context you like, there's
nothing that says they are only allowed in a function, or something like
it, other than any rules you impose upon yourself ... it seems like as
described, the "${ " form is more like execution of a '.' script, than
a function.)

It's more like `eval' on a group command with local variables and return.
Dot scripts don't have local variables, either.
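A small sketch of the local-variable and `return` behaviour, assuming bash 5.3 or later. The variable names are hypothetical.

```bash
# Assumes bash 5.3 or later; on older shells, stop before the new syntax is reached.
(( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 3) )) || exit 0

name=outer

# 'local' creates a variable scoped to the substitution; 'return' ends the body early.
result=${ local name=inner; printf '%s' "$name"; return; printf ' never printed'; }

printf 'result=%s name=%s\n' "$result" "$name"
```

The caller's `name` is untouched once the substitution completes, and nothing after `return` contributes to the captured output.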


   | Expansions are performed left-to-right (or, if you prefer, beginning to
   | end) in the word, and the command substitution modifies the current
   | execution environment.

Sure, but last time I looked, it was still unspecified just when side
effects of expansions get applied.

Bash is consistent about applying those effects, and this form of command
substitution doesn't change that.
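The ordering can be observed in bash with the long-standing `${var:=word}` expansion, used here as a stand-in for any expansion that has a side effect. The names `v` and `words` are hypothetical.

```bash
unset -v v

# Three expansions in one command: the middle one assigns to v as a side effect.
words=( "${v-unset}" "${v:=assigned}" "${v-unset}" )

printf '%s\n' "${words[@]}"
```

In bash the first word still sees `v` as unset, while the third already sees the value assigned by the second. A nofork command substitution that modifies the current environment follows the same left-to-right rule.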

   | There's no reason to be deliberately incompatible.

Sure, understood, but also no reason to implement nonsense, or frills
which are not needed.  Make what is implemented be compatible, by all
means, but that does not mean you need to implement everything.

Maybe not everyone would consider the same things to be frills that you do.


After all mksh doesn't attempt to be compat with bash wrt brace expansion.
bash does it the sane way (first) - mksh (maybe kshNN as well) the insane
way (after parameter expansions) - nothing at all compatible there.

That's too bad. My guess is that it uses the BSD glob(3) function, which
for some reason performs brace expansion if you pass the GLOB_BRACE flag.
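The difference in ordering is easy to see in bash, where brace expansion runs before parameter expansion. A brief sketch with hypothetical variable names:

```bash
list='{a,b}'
lo=1
hi=3

# Braces that only appear after parameter expansion are not expanded.
from_var=( x$list )

# A sequence needs literal endpoints at brace-expansion time, so this stays one word.
range=( {$lo..$hi} )

# Braces present in the source text expand first; the parameters are expanded afterwards.
pair=( pre{$lo,$hi} )

printf '<%s>' "${from_var[@]}" "${range[@]}" "${pair[@]}"; echo
```

In bash this yields the single words `x{a,b}` and `{1..3}`, followed by the two words `pre1` and `pre3`. A shell that performs brace expansion after parameter expansion, as described above for mksh, would treat the first two cases differently.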

The issues with all of this are more the increased complexity for the user
trying to work out how it all works, and why.

So what would you add to the documentation of the feature to make it clearer?

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/



