Re: [Qemu-devel] backup bug or question

From:	John Snow
Subject:	Re: [Qemu-devel] backup bug or question
Date:	Fri, 9 Aug 2019 16:13:11 -0400
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 8/9/19 9:18 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi!
> 
> Hmm, hacking around backup I have a question:
> 
> What prevents guest write request after job_start but before setting
> write notifier?
> 
> code path:
> 
> qmp_drive_backup or transaction with backup
> 
>     job_start
>        aio_co_enter(job_co_entry) /* may only schedule execution, isn't it ? */
> 
> ....
> 
> job_co_entry
>     job_pause_point() /* it definitely yields, isn't it bad? */
>     job->driver->run() /* backup_run */
> 
> ----
> 
> backup_run()
>     bdrv_add_before_write_notifier()
> 
> ...
> 

I think you're right... :(

We create jobs like this:

job->paused        = true;
job->pause_count   = 1;

And then job_start does this:

job->co = qemu_coroutine_create(job_co_entry, job);
job->pause_count--;
job->busy = true;
job->paused = false;

Which means that job_co_entry is being called before we lift the pause:

assert(job && job->driver && job->driver->run);
job_pause_point(job);
job->ret = job->driver->run(job, &job->err);

...Which means that we are definitely yielding in job_pause_point.

Yeah, that's a race condition waiting to happen.
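
To make the suspected ordering concrete, here is a toy model of it. This is
not QEMU code: every type and function name below (ToyJob, toy_job_start,
and so on) is made up for illustration, and the model only assumes what is
described above, namely that starting the job merely schedules the coroutine
and that the write notifier is installed later, once run() is reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names that mimic the ordering discussed
 * above. None of this is QEMU's real API. */
typedef struct ToyJob {
    bool coroutine_scheduled;  /* start has queued the job coroutine */
    bool notifier_installed;   /* set only once run() is reached */
    int  unprotected_writes;   /* guest writes that no notifier saw */
} ToyJob;

/* Models job start: the coroutine is scheduled, but it has not yet run
 * far enough to install the before-write notifier. */
static void toy_job_start(ToyJob *job)
{
    job->coroutine_scheduled = true;
}

/* Models the coroutine finally reaching run(), which installs the
 * notifier. */
static void toy_job_run(ToyJob *job)
{
    assert(job->coroutine_scheduled);
    job->notifier_installed = true;
}

/* Models a guest write: it is only intercepted if the notifier exists. */
static void toy_guest_write(ToyJob *job)
{
    if (!job->notifier_installed) {
        job->unprotected_writes++;
    }
}

/* Replays the suspected interleaving and returns how many guest writes
 * slipped through before the notifier was in place. */
static int toy_race_demo(void)
{
    ToyJob job = { false, false, 0 };

    toy_job_start(&job);
    toy_guest_write(&job);  /* lands in the window: not intercepted */
    toy_job_run(&job);
    toy_guest_write(&job);  /* notifier installed: intercepted */

    return job.unprotected_writes;
}
```

In this model toy_race_demo() counts exactly one write that went past the
job unnoticed, which is the window being asked about.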

> And what guarantees we give to the user? Is it guaranteed that write notifier
> is set when qmp command returns?
>
> And I guess, if we start several backups in a transaction it should be
> guaranteed that the set of backups is consistent and correspond to one point
> in time...
> 

I would have hoped that maybe the drain_all coupled with the individual
jobs taking drain_start and drain_end would save us, but I guess we
simply don't have a guarantee that all backup jobs WILL have installed
their handler by the time the transaction ends.

Or, if there is that guarantee, I don't know what provides it, so I
think we shouldn't count on it accidentally working anymore.

I think we should do two things:

1. Move the handler installation to creation time.
2. Modify backup_before_write_notify to return without invoking
backup_do_cow if the job isn't started yet.
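
Roughly, the two steps above could look like the following sketch. It is a
standalone toy, not the actual patch: ToyBackupJob and the toy_* functions
are hypothetical stand-ins, and the cow_calls counter merely stands in for
invoking backup_do_cow.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a backup job. All names here are hypothetical. */
typedef struct ToyBackupJob {
    bool notifier_installed;  /* step 1: true from creation onwards */
    bool started;             /* set when the job actually starts */
    int  cow_calls;           /* stands in for backup_do_cow invocations */
} ToyBackupJob;

/* Step 1: the handler is installed at creation time, so there is no
 * window in which a started job lacks its notifier. */
static ToyBackupJob toy_backup_job_create(void)
{
    ToyBackupJob job = { true, false, 0 };
    return job;
}

static void toy_backup_job_start(ToyBackupJob *job)
{
    assert(job && job->notifier_installed);
    job->started = true;
}

/* Step 2: the before-write handler returns without doing any
 * copy-on-write work if the job has not been started yet. */
static int toy_before_write_notify(ToyBackupJob *job)
{
    assert(job && job->notifier_installed);
    if (!job->started) {
        return 0;  /* job not started: nothing to protect yet */
    }
    job->cow_calls++;  /* stand-in for the copy-on-write step */
    return 0;
}
```

A write arriving between toy_backup_job_create() and toy_backup_job_start()
leaves cow_calls at 0; the first write after start bumps it to 1.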

I'll send a patch in just a moment ...

--js
