From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
Date: Mon, 15 Feb 2016 15:06:35 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 15.02.2016 at 14:54, Pavel Dovgalyuk wrote:
> > From: Kevin Wolf [mailto:address@hidden
> > On 15.02.2016 at 10:14, Pavel Dovgalyuk wrote:
> > > > From: Pavel Dovgalyuk [mailto:address@hidden
> > > > > From: Kevin Wolf [mailto:address@hidden
> > > > > > >
> > > > > > > int blkreplay_co_readv()
> > > > > > > {
> > > > > > >     BlockReplayState *s = bs->opaque;
> > > > > > >     int reqid = s->reqid++;
> > > > > > >
> > > > > > >     bdrv_co_readv(bs->file, ...);
> > > > > > >
> > > > > > >     if (mode == record) {
> > > > > > >         log(reqid, time);
> > > > > > >     } else {
> > > > > > >         assert(mode == replay);
> > > > > > >         bool *done = req_replayed_list_get(reqid)
> > > > > > >         if (done) {
> > > > > > >             *done = true;
> > > > > > >         } else {
> > > > > > point A
> > > > > > >             req_completed_list_insert(reqid,
> > > > > > >                                       qemu_coroutine_self());
> > > > > > >             qemu_coroutine_yield();
> > > > > > >         }
> > > > > > >     }
> > > > > > > }
> > > > > > >
> > > > > > > /* called by replay.c */
> > > > > > > int blkreplay_run_event()
> > > > > > > {
> > > > > > >     if (mode == replay) {
> > > > > > >         co = req_completed_list_get(e.reqid);
> > > > > > >         if (co) {
> > > > > > >             qemu_coroutine_enter(co);
> > > > > > >         } else {
> > > > > > >             bool done = false;
> > > > > > >             req_replayed_list_insert(reqid, &done);
> > > > > > point B
> > > > > > >             /* wait synchronously for completion */
> > > > > > >             while (!done) {
> > > > > > >                 aio_poll();
> > > > > > >             }
> > > > > > >         }
> > > > > > >     }
> > > > > > > }
> > > > > >
> > > > > > One more question about coroutines.
> > > > > > Are race conditions possible in this sample?
> > > > > > In replay mode we may call readv and reach point A.
> > > > > > At the same time, we reach point B in another thread.
> > > > > > Then readv will yield and nobody will resume it?
> > > > >
> > > > > There are two aspects to this:
> > > > >
> > > > > * Real multithreading doesn't exist in the block layer. All block driver
> > > > >   functions are only called with the mutex in the AioContext held. There
> > > > >   is exactly one AioContext per BDS, so no two threads can possibly be
> > > > >   operating on the same BDS at the same time.
> > > > >
> > > > > * Coroutines are different from threads in that they aren't preemptive.
> > > > >   They are only interrupted in places where they explicitly yield.
> > > > >
> > > > > Of course, in order for this to work, we actually need to take the mutex
> > > > > before calling blkreplay_run_event(), which is called directly from the
> > > > > replay code (which runs in the mainloop thread? Or vcpu?).
> > > >
> > > > blkreplay_run_event() is called from replay code which is protected by a
> > > > mutex. This function may be called from io and vcpu threads, because both
> > > > of them invoke replay functions.
> > >
> > > Now I've encountered a situation where blkreplay_run_event is called from
> > > the read coroutine:
> > > bdrv_prwv_co -> aio_poll -> qemu_clock_get_ns -> replay_read_clock ->
> > > blkreplay_run_event
> > > \--> bdrv_co_readv -> blkreplay_co_readv ->
> > > bdrv_co_readv(lower layer)
> > >
> > > bdrv_co_readv inside blkreplay_co_readv can't proceed in this situation.
> > > This is probably because aio_poll has taken the aio context?
> > > How can I resolve this?
> >
> > First of all, I'm not sure if running replay events from
> > qemu_clock_get_ns() is such a great idea. This is not a function that
> > callers expect to change any state. If you absolutely have to do it
> > there instead of in the clock device emulations, maybe restricting it to
> > replaying clock events could make it a bit more harmless.
>
> Only the virtual clock is emulated; the host clock is read from the host's
> real-time sources and therefore has to be saved into the log.
Isn't the host clock invisible to the guest anyway?
> There could be asynchronous events that occur in non-cpu threads.
> For now these events are shutdown requests and block task execution.
> They may "hide" the following clock (or other) events. That is why
> we process them until a synchronous event (like clock, instruction
> execution, or checkpoint) is met.
>
>
> > Anyway, what does "can't proceed" mean? The coroutine yields because
> > it's waiting for I/O, but it is never reentered? Or is it hanging while
> > trying to acquire a lock?
>
> I've solved this problem by slightly modifying the queue.
> I haven't yet associated a BlockDriverState with each request id.
> Therefore aio_poll was temporarily replaced with usleep.
> Now execution starts and hangs at some random moment while the OS is loading.
>
> Here is the current version of blkreplay functions:
>
> static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
>     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
> {
>     uint32_t reqid = request_id++;
>     Request *req;
>     req = block_request_insert(reqid, bs, qemu_coroutine_self());
>     bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
>
>     if (replay_mode == REPLAY_MODE_RECORD) {
>         replay_save_block_event(reqid);
>     } else {
>         assert(replay_mode == REPLAY_MODE_PLAY);
>         qemu_coroutine_yield();
>     }
>     block_request_remove(req);
>
>     return 0;
> }
>
> void replay_run_block_event(uint32_t id)
> {
>     Request *req;
>     if (replay_mode == REPLAY_MODE_PLAY) {
>         while (!(req = block_request_find(id))) {
>             //aio_poll(bdrv_get_aio_context(req->bs), true);
>             usleep(1);
>         }
How is this loop supposed to make any progress?
And I still don't understand why aio_poll() doesn't work and where it
hangs.
Kevin
>         qemu_coroutine_enter(req->co, NULL);
>     }
> }
>
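[Editorial sketch, not part of the original mail.] One way to read the question about the loop making progress: if the request can only be registered by work that the waiting thread itself has to dispatch, then sleeping in that thread never lets the condition become true, whereas driving the event loop does. The toy model below illustrates only that point. Every toy_* name is hypothetical and none of it is QEMU API; toy_poll() merely stands in for the role aio_poll() plays, and the single fixed-size queue is an assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending completion callback (not a QEMU type). */
typedef struct ToyEvent {
    void (*cb)(void *opaque);
    void *opaque;
} ToyEvent;

#define TOY_QUEUE_MAX 8
static ToyEvent toy_queue[TOY_QUEUE_MAX];
static size_t toy_head, toy_tail;

/* Queue a callback; it runs only when somebody calls toy_poll(). */
static bool toy_schedule(void (*cb)(void *opaque), void *opaque)
{
    assert(cb != NULL);
    if (toy_tail == TOY_QUEUE_MAX) {
        return false;
    }
    toy_queue[toy_tail].cb = cb;
    toy_queue[toy_tail].opaque = opaque;
    toy_tail++;
    return true;
}

/* Toy analogue of an event-loop iteration: dispatch one pending callback.
 * Returns false when there was nothing to dispatch. */
static bool toy_poll(void)
{
    if (toy_head == toy_tail) {
        return false;
    }
    ToyEvent ev = toy_queue[toy_head++];
    ev.cb(ev.opaque);
    return true;
}

static void toy_mark_done(void *opaque)
{
    *(bool *)opaque = true;
}

/* Wait by driving the loop. Returns the number of dispatches needed,
 * or -1 if the queue drained without the condition becoming true. */
static int toy_wait_polling(const bool *done)
{
    int dispatched = 0;
    while (!*done) {
        if (!toy_poll()) {
            return -1;
        }
        dispatched++;
    }
    return dispatched;
}

/* Wait without driving the loop (bounded here so the sketch terminates).
 * Nothing in this thread runs callbacks, so *done cannot change. */
static bool toy_wait_sleeping(const bool *done, int max_spins)
{
    for (int i = 0; i < max_spins && !*done; i++) {
        /* a usleep(1) would go here */
    }
    return *done;
}
```

Calling toy_schedule(toy_mark_done, &done) and then toy_wait_sleeping(&done, 1000) leaves done false, while toy_wait_polling(&done) completes after a single dispatch. Whether the real hang has this cause depends on which thread runs the coroutine, which the thread has not established.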
> > Can you provide more detail about the exact place where it's hanging,
> > both in the coroutine and in the main "coroutine" that executes
> > aio_poll()?
>
> In this version replay_run_block_event() executes the while loop.
> I haven't found out what the other threads do, because the debugger doesn't
> show me the call stack when a thread is waiting in some blocking function.
>
> Pavel Dovgalyuk
>
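[Editorial sketch, not part of the original mail.] The pseudocode quoted at the top of this message is a rendezvous between two parties that may arrive in either order: the I/O completion (point A) and the replayed event (point B). Because coroutines are not preemptive and block driver code runs under the AioContext mutex, the check and the insert on each side are not interleaved with the other side. The model below encodes that rendezvous as a small state table. It uses no real coroutines; all toy_* names and the fixed-size table are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_REQS 16

typedef enum {
    TOY_IDLE,           /* neither side has arrived yet */
    TOY_PARKED,         /* I/O finished first; request waits for the event */
    TOY_EVENT_WAITING,  /* event arrived first; replay side must poll */
    TOY_FINISHED        /* both sides have met */
} ToyState;

static ToyState toy_reqs[TOY_MAX_REQS];

/* I/O side (point A). Returns true if the request may finish right away,
 * false if it has to park, which models qemu_coroutine_yield(). */
static bool toy_io_completed(unsigned reqid)
{
    assert(reqid < TOY_MAX_REQS);
    if (toy_reqs[reqid] == TOY_EVENT_WAITING) {
        toy_reqs[reqid] = TOY_FINISHED;   /* the "*done = true" branch */
        return true;
    }
    assert(toy_reqs[reqid] == TOY_IDLE);
    toy_reqs[reqid] = TOY_PARKED;
    return false;
}

/* Replay side (point B). Returns true if a parked request was resumed,
 * false if the caller has to keep polling until the I/O side arrives. */
static bool toy_replay_event(unsigned reqid)
{
    assert(reqid < TOY_MAX_REQS);
    if (toy_reqs[reqid] == TOY_PARKED) {
        toy_reqs[reqid] = TOY_FINISHED;   /* models qemu_coroutine_enter() */
        return true;
    }
    assert(toy_reqs[reqid] == TOY_IDLE);
    toy_reqs[reqid] = TOY_EVENT_WAITING;
    return false;
}

static bool toy_finished(unsigned reqid)
{
    assert(reqid < TOY_MAX_REQS);
    return toy_reqs[reqid] == TOY_FINISHED;
}
```

In both orders the request ends in TOY_FINISHED exactly once: I/O first parks and is resumed by the event, event first waits and is released by the I/O side. The model holds only as long as each function runs to completion without the other side running in between, which is the guarantee described earlier in the thread.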