From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
Date: Wed, 23 Nov 2016 09:40:50 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Tue, Nov 22, 2016 at 12:26:34PM -0500, John Snow wrote:
>
>
> On 11/22/2016 11:16 AM, Eric Blake wrote:
> > On 11/22/2016 10:07 AM, John Snow wrote:
> > >
> > >
> > > On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
> > > > Hi, everyone.
> > > >
> > > > There is a problem with current incremental backups. Imagine I ask
> > > > qemu to make an incremental backup, then go away and come back when
> > > > the backup job is finished. The qemu process dismisses the job
> > > > completely and I missed all the events, so I don't know the result of
> > > > the operation and, most importantly, I don't know the base for the
> > > > dirty bitmap now. In case of failure it is the previous backup and in
> > > > case of success it is the last backup. Qemu does not track the dirty
> > > > bitmap base for me, so I have no choice other than to clear the dirty
> > > > bitmap and make a full backup, which would be rather unexpected from
> > > > the user's POV. (The situation of going away/coming back is a libvirt
> > > > crash/restart, of course.)
> > > >
> > >
> > > Why was the completion/failure event missed? Is there some reason why
> > > you cannot guarantee that you will observe the completion?
> >
> > I think the intent of some of the on-error parameters is to make it so
> > that the job can't go away on error, only on success. Admittedly,
> > libvirt isn't using those policies as well as it could.
> >
> > >
> > > > I guess the problem has a wider scope. In case I miss the successful
> > > > completion of a full backup, my only option is to drop the backup
> > > > file and redo the backup completely, which is rather wasteful. AFAIU
> > > > I cannot query the backup completion result from the backup file
> > > > itself. I guess there can be similar issues for other qemu jobs.
> > > >
> > > > Nikolay
> > > >
> > >
> > > I would personally advocate for a job-neutral solution where jobs can be
> > > given a parameter such that the job persists in memory in a new
> > > "completed" state until such time that it is queried explicitly, then it
> > > can be dropped.
> > >
> > > I am not sure if we can make this the default behavior, as it might
> > > confuse libvirt to occasionally see jobs that have already completed.
> > >
> > > Talking to Kevin off-list, he suggested that we might be able to make
> > > this the default behavior if we pivot to the new jobs API that I have
> > > been proposing, accompanied by a new explicit command to put a command
> > > to rest.
> >
> > Yeah, revisiting the overall job API will require some overhaul in
> > libvirt as well, but it is probably worth it.
> >
>
> I wonder if I should try to rectify this temporarily for 2.9, or just jump
> straight into a new interface.
I suggest drafting the "proper" API fix. If it turns out to be a major
undertaking then maybe a sub-problem can be solved more easily instead.
But attacking the full problem first seems like a good approach - the
QEMU 2.9 development cycle hasn't even opened yet :).
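
For illustration only, here is a minimal sketch of the behaviour John
describes above: a finished job stays around in a "completed" state so
its result can still be queried, and it is dropped only by an explicit
command. This is not QEMU code or a proposed QMP interface; the names
JobRegistry, query and dismiss are hypothetical and chosen just for
this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobState(Enum):
    RUNNING = "running"
    COMPLETED = "completed"  # finished, but result not yet collected


@dataclass
class Job:
    job_id: str
    state: JobState = JobState.RUNNING
    error: Optional[str] = None  # None means the job succeeded


class JobRegistry:
    """Hypothetical job table that keeps finished jobs until dismissed."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def start(self, job_id: str) -> None:
        self._jobs[job_id] = Job(job_id)

    def finish(self, job_id: str, error: Optional[str] = None) -> None:
        # The job is NOT removed here; it only changes state, so a
        # management tool that missed the event can still find it.
        job = self._jobs[job_id]
        job.state = JobState.COMPLETED
        job.error = error

    def query(self, job_id: str) -> Optional[Job]:
        # Returns None only if the job never existed or was dismissed.
        return self._jobs.get(job_id)

    def dismiss(self, job_id: str) -> Job:
        # Explicit "put the job to rest" step; refuses running jobs.
        job = self._jobs[job_id]
        if job.state is not JobState.COMPLETED:
            raise RuntimeError(f"job {job_id!r} has not completed yet")
        return self._jobs.pop(job_id)
```

With something like this, a management tool that restarts after the
backup has finished can still call query() to learn whether the job
succeeded, pick the right dirty bitmap base from that, and only then
call dismiss().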
Stefan