Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Date: Thu, 19 Jul 2018 11:05:34 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Peter Xu <address@hidden> writes:

> On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
>> Peter Xu <address@hidden> writes:
>> 
>> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
>> >> Peter Xu <address@hidden> writes:
>> >> 
>> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
>> >> > cur_mon as well (via monitor_qmp_dispatch_one()).

Since renamed to monitor_qmp_dispatch().

Further down, we concluded that cur_mon isn't actually used from the I/O
thread, didn't we?

>> >> >                                                    Let's convert the
>> >> > cur_mon variable to be a per-thread variable to make sure there won't be
>> >> > a race between threads when accessing the variable.
>> >> 
>> >> Hmm... why hasn't the OOB work created such a race already?
>> >> 
>> >> A monitor reads, parses, dispatches and executes commands, formats and
>> >> sends replies.
>> >> 
>> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
>> >> should therefore be from the main thread.  No races.
>> >> 
>> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
>> >> execute remain in the main thread.  *Except* for commands executed OOB,
>> >> dispatch and execute move to the I/O thread, too.
>> >> 
>> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
>> >> in any part that may now execute in the I/O thread.  Scary...
>> >
>> > I think it's because cur_mon is not really used in out-of-band command
>> > executions - now we only have a few out-of-band enabled commands, and
>> > IIUC none of them is using cur_mon (for example, in
>> > qmp_migrate_recover() we don't even call error_report, and the code
>> > path is quite straight forward to make sure of that).  So IIUC cur_mon
>> > variable is still only touched by main thread for now hence we should
>> > be safe.  However that condition might change in the future when we
>> > add more out-of-band capable commands.
>> >
>> > (not to mention that I don't even know whether there are real users of
>> >  out-of-band if we haven't yet started to support that for libvirt...)
>> 
>> It's not just the actual OOB commands (there are just two), it's also
>> the monitor code to read, parse, format and send.
>
> My understanding is that read, parse, format, send will not touch
> cur_mon (it was touched before but some patches in the out-of-band
> series should have removed the last users when parsing).  So IIUC only
> the dispatcher would touch that now.  I didn't consider the callers
> like net_init_socket() and I'm only considering the monitor code (and
> those callers should be only in the main thread too after all).

There *is* cur_mon use outside dispatch & execute, e.g.

    void error_vprintf(const char *fmt, va_list ap)
    {
        if (cur_mon && !monitor_cur_is_qmp()) {
            monitor_vprintf(cur_mon, fmt, ap);
        } else {
            vfprintf(stderr, fmt, ap);
        }
    }

Obviously unsafe to use outside the main thread.  Consider:

    bool monitor_cur_is_qmp(void)
    {
        return cur_mon && monitor_is_qmp(cur_mon);
    }

    static inline bool monitor_is_qmp(const Monitor *mon)
    {
        return (mon->flags & MONITOR_USE_CONTROL);
    }

If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
do), this crashes when the main thread sets cur_mon back to null in
between.

Did the OOB work make things any worse?  Let's see.

@cur_mon is null unless the main thread is running monitor code, either
HMP within monitor_read():

    cur_mon = opaque;

    if (cur_mon->rs) {
        for (i = 0; i < size; i++)
            readline_handle_byte(cur_mon->rs, buf[i]);
    } else {
        if (size == 0 || buf[size - 1] != 0)
            monitor_printf(cur_mon, "corrupted command\n");
        else
            handle_hmp_command(cur_mon, (char *)buf);
    }

    cur_mon = old_mon;

or QMP within monitor_qmp_dispatch():

    old_mon = cur_mon;
    cur_mon = mon;

    rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));

    cur_mon = old_mon;

In both cases, old_mon is always null.

Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
deeper for QMP", we ran more code for QMP with cur_mon set, namely the
JSON parser, but that doesn't matter here.

More fine print: there's also qmp_human_monitor_command(), which stacks
an HMP monitor on top of the QMP monitor.  Also doesn't matter here.
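
For illustration only, here is a minimal sketch of that save/restore
discipline, including the stacked case.  The Monitor struct, the helper
with_cur_mon() and the functions inner(), outer() and run_nested() are
hypothetical stand-ins, not QEMU code; the point is merely that each
level restores whatever value it found, so the outermost level ends up
back at null.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for QEMU's Monitor; only here to make the sketch compile. */
typedef struct Monitor { const char *name; } Monitor;

static Monitor *cur_mon;            /* null unless monitor code is running */

typedef void (*MonitorFn)(void *opaque);

/* Hypothetical helper: run fn() with cur_mon set to mon, then restore. */
static void with_cur_mon(Monitor *mon, MonitorFn fn, void *opaque)
{
    Monitor *old_mon = cur_mon;     /* null at the outermost level */

    assert(fn != NULL);
    cur_mon = mon;
    fn(opaque);
    cur_mon = old_mon;
}

static Monitor qmp_mon = { "qmp" };
static Monitor hmp_mon = { "hmp" };
static Monitor *seen_inner;         /* cur_mon observed in the stacked HMP part */
static Monitor *seen_after_inner;   /* cur_mon observed after HMP returned */

static void inner(void *opaque)
{
    (void)opaque;
    seen_inner = cur_mon;
}

/* Models a QMP command that stacks an HMP monitor on top of itself. */
static void outer(void *opaque)
{
    (void)opaque;
    with_cur_mon(&hmp_mon, inner, NULL);
    seen_after_inner = cur_mon;
}

static void run_nested(void)
{
    with_cur_mon(&qmp_mon, outer, NULL);
}
```

After run_nested() returns, seen_inner points to hmp_mon,
seen_after_inner points to qmp_mon, and cur_mon is null again.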

The OOB work doesn't add any new races as long as

* it doesn't add assignments to @cur_mon, and

* none of the code it moves out of the main thread accesses @cur_mon.

The first condition obviously holds.  The second one isn't obvious, but
I figure it holds, too.

Okay, I think I've convinced myself the OOB work didn't add
cur_mon-related races.

>> >> Should this go into 3.0 to reduce the risk of bugs?
>> >
>> > Yes I think it would be good to have that even for 3.0, since it still
>> > can be seen as a bug fix of existing code.
>> 
>> Agreed.
>> 
>> > Regards,
>> >
>> >> > Note that thread variables are not initialized to a valid value when new
>> >> > thread is created.
>> 
>> Confusing.  It sounds like @cur_mon's initial value would be
>> indeterminate, like an automatic variable's.  Not true.  Variables with
>> thread storage duration are initialized when the thread is created.
>> Since @cur_mon's declaration lacks an initializer, it'll be initialized
>> to a null pointer.  Your sentence is correct when you consider that null
>> pointer not a valid value.
>
> Yes that's what I meant.  So how about this?
>
>   Note that the per-thread @cur_mon variable is not initialized to
>   point to a valid Monitor struct when a new thread is created (the
>   default value will be NULL).
>
> Please feel free to tune it up.

I think what the patch really changes is the value of @cur_mon outside
the main thread: it remains null there now.  Before, it depended on what
the main thread was doing, and therefore could not be used safely.

In other words, the patch makes uses of @cur_mon like the one in
error_vprintf() shown above safe.
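
A minimal sketch of that behavior, assuming a C11 compiler and POSIX
threads (build with -pthread).  The Monitor struct and the functions
io_thread_fn() and other_thread_sees_null() are made up for the
example, and it spells the storage class _Thread_local; the actual
patch may spell it differently.  A thread-local pointer declared
without an initializer starts out null in every new thread, regardless
of what the creating thread has stored in its own copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for QEMU's Monitor; only here to make the sketch compile. */
typedef struct Monitor { int flags; } Monitor;

/* Per-thread, no initializer: each thread's copy starts out null. */
static _Thread_local Monitor *cur_mon;

/* Runs in the new thread and records whether its cur_mon is null. */
static void *io_thread_fn(void *opaque)
{
    int *seen_null = opaque;

    assert(seen_null != NULL);
    *seen_null = (cur_mon == NULL);
    return NULL;
}

/*
 * Set cur_mon in the calling thread, then check what a freshly created
 * thread sees.  Returns 1 if the new thread saw null, 0 if it did not,
 * and -1 if the thread could not be created.
 */
static int other_thread_sees_null(void)
{
    Monitor mon = { 0 };
    Monitor *old_mon = cur_mon;
    int seen_null = 0;
    pthread_t tid;

    cur_mon = &mon;                 /* the "main thread" is in monitor code */
    if (pthread_create(&tid, NULL, io_thread_fn, &seen_null) != 0) {
        cur_mon = old_mon;
        return -1;
    }
    pthread_join(tid, NULL);
    cur_mon = old_mon;
    return seen_null;
}
```

With a plain global instead of a thread-local variable, the new thread
would see the caller's non-null pointer, which is exactly the situation
that makes error_vprintf() unsafe outside the main thread.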

I think that's what we should explain in the commit message.  I can try
rewriting it, but right now I've got to run.

>> 
>> >> >                     However for our case we don't need to set it up,
>> >> > since the cur_mon variable is only used in such a pattern:
>> >> > 
>> >> >   old_mon = cur_mon;
>> >> >   cur_mon = xxx;
>> >> >   (do something, read cur_mon if necessary in the stack)
>
> [1]
>
>> >> >   cur_mon = old_mon;
>> >> > 
>> >> > It plays a role as stack variable, so no need to be initialized at all.
>> >> > We only need to make sure the variable won't be changed unexpectedly by
>> >> > other threads.
>> 
>> Do we need this paragraph?  The commit doesn't mess with @cur_mon's
>> initial value at all...
>
> I was trying to explain why we don't need to initialize that variable
> for each thread.  A common idea (at least that's what I have had in
> mind) is that when we create a new thread we should possibly inherit
> that @cur_mon variable in a copy-on-write fashion for that new thread.
> But that's not really necessary for the use case like above (as long
> as we don't create thread during [1], and that's what we do).
>
> If you think the patch explains itself better without these lines,
> please feel free to drop it.
>
>> 
>> >> > Reviewed-by: Eric Blake <address@hidden>
>> >> > Reviewed-by: Marc-André Lureau <address@hidden>
>> >> > Reviewed-by: Stefan Hajnoczi <address@hidden>
>> >> > [peterx: touch up commit message a bit]
>> >> > Signed-off-by: Peter Xu <address@hidden>
>
> Thanks,


