[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Safely reopening image files by stashing fds
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] Safely reopening image files by stashing fds |
Date: |
Tue, 09 Aug 2011 14:24:23 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20110707 Thunderbird/5.0 |
Am 09.08.2011 14:00, schrieb Stefan Hajnoczi:
> On Tue, Aug 09, 2011 at 01:39:13PM +0200, Kevin Wolf wrote:
>> Am 09.08.2011 12:56, schrieb Stefan Hajnoczi:
>>> On Tue, Aug 9, 2011 at 11:50 AM, Stefan Hajnoczi <address@hidden> wrote:
>>>> On Tue, Aug 9, 2011 at 11:35 AM, Kevin Wolf <address@hidden> wrote:
>>>>> Am 09.08.2011 12:25, schrieb Stefan Hajnoczi:
>>>>>> On Mon, Aug 8, 2011 at 4:16 PM, Kevin Wolf <address@hidden> wrote:
>>>>>>> Am 08.08.2011 16:49, schrieb Stefan Hajnoczi:
>>>>>>>> On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf <address@hidden> wrote:
>>>>>>>>> Am 05.08.2011 11:29, schrieb Stefan Hajnoczi:
>>>>>>>>>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf <address@hidden> wrote:
>>>>>>>>>>> Am 05.08.2011 10:40, schrieb Stefan Hajnoczi:
>>>>>>>>>>>> We've discussed safe methods for reopening image files (e.g.
>>>>>>>>>>>> useful for
>>>>>>>>>>>> changing the hostcache parameter). The problem is that closing
>>>>>>>>>>>> the file first
>>>>>>>>>>>> and then opening it again exposes us to the error case where the
>>>>>>>>>>>> open fails.
>>>>>>>>>>>> At that point we cannot get to the file anymore and our options
>>>>>>>>>>>> are to
>>>>>>>>>>>> terminate QEMU, pause the VM, or offline the block device.
>>>>>>>>>>>>
>>>>>>>>>>>> This window of vulnerability can be eliminated by keeping the file
>>>>>>>>>>>> descriptor
>>>>>>>>>>>> around and falling back to it should the open fail.
>>>>>>>>>>>>
>>>>>>>>>>>> The challenge for the file descriptor approach is that image
>>>>>>>>>>>> formats, like
>>>>>>>>>>>> VMDK, can span multiple files. Therefore the solution is not as
>>>>>>>>>>>> simple as
>>>>>>>>>>>> stashing a single file descriptor and reopening from it.
>>>>>>>>>>>
>>>>>>>>>>> So far I agree. The rest I believe is wrong because you can't assume
>>>>>>>>>>> that every backend uses file descriptors. The qemu block layer is
>>>>>>>>>>> based
>>>>>>>>>>> on BlockDriverStates, not fds. They are a concept that should be
>>>>>>>>>>> hidden
>>>>>>>>>>> in raw-posix.
>>>>>>>>>>>
>>>>>>>>>>> I think something like this could do:
>>>>>>>>>>>
>>>>>>>>>>> struct BDRVReopenState {
>>>>>>>>>>> BlockDriverState *bs;
>>>>>>>>>>> /* can be extended by block drivers */
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state,
>>>>>>>>>>> int
>>>>>>>>>>> flags);
>>>>>>>>>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>>>>>>>>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>>>>>>>>>>
>>>>>>>>>>> raw-posix would store the old file descriptor in its reopen_state.
>>>>>>>>>>> On
>>>>>>>>>>> commit, it closes the old descriptors, on abort it reverts to the
>>>>>>>>>>> old
>>>>>>>>>>> one and closes the newly opened one.
>>>>>>>>>>>
>>>>>>>>>>> Makes things a bit more complicated than the simple bdrv_reopen I
>>>>>>>>>>> had in
>>>>>>>>>>> mind before, but it allows VMDK to get an all-or-nothing semantics.
>>>>>>>>>>
>>>>>>>>>> Can you show how bdrv_reopen() would use these new interfaces? I'm
>>>>>>>>>> not 100% clear on the idea.
>>>>>>>>>
>>>>>>>>> Well, you wouldn't only call bdrv_reopen, but also either
>>>>>>>>> bdrv_reopen_commit/abort (for the top-level caller we can have a
>>>>>>>>> wrapper
>>>>>>>>> function that does both, but that's syntactic sugar).
>>>>>>>>>
>>>>>>>>> For example we would have:
>>>>>>>>>
>>>>>>>>> int vmdk_reopen()
>>>>>>>>
>>>>>>>> .bdrv_reopen() is a confusing name for this operation because it does
>>>>>>>> not reopen anything. bdrv_prepare_reopen() might be clearer.
>>>>>>>
>>>>>>> Makes sense.
>>>>>>>
>>>>>>>>
>>>>>>>>> {
>>>>>>>>> *((VMDKReopenState**) rs) = malloc();
>>>>>>>>>
>>>>>>>>> foreach (extent in s->extents) {
>>>>>>>>> ret = bdrv_reopen(extent->file, &extent->reopen_state)
>>>>>>>>> if (ret < 0)
>>>>>>>>> goto fail;
>>>>>>>>> }
>>>>>>>>> return 0;
>>>>>>>>>
>>>>>>>>> fail:
>>>>>>>>> foreach (extent in rs->already_reopened) {
>>>>>>>>> bdrv_reopen_abort(extent->reopen_state);
>>>>>>>>> }
>>>>>>>>> return ret;
>>>>>>>>> }
>>>>>>>>
>>>>>>>>> void vmdk_reopen_commit()
>>>>>>>>> {
>>>>>>>>> foreach (extent in s->extents) {
>>>>>>>>> bdrv_reopen_commit(extent->reopen_state);
>>>>>>>>> }
>>>>>>>>> free(rs);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> void vmdk_reopen_abort()
>>>>>>>>> {
>>>>>>>>> foreach (extent in s->extents) {
>>>>>>>>> bdrv_reopen_abort(extent->reopen_state);
>>>>>>>>> }
>>>>>>>>> free(rs);
>>>>>>>>> }
>>>>>>>>
>>>>>>>> Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs,
>>>>>>>> &rs)?
>>>>>>>
>>>>>>> No. Closing the old backend would be part of bdrv_reopen_commit.
>>>>>>>
>>>>>>> Do you have a use case where it would be helpful if the caller invoked
>>>>>>> bdrv_close?
>>>>>>
>>>>>> When the caller does bdrv_close() two BlockDriverStates are never open
>>>>>> for the same image file. I thought this was a property we wanted.
>>>>>>
>>>>>> Also, in the block_set_hostcache case we need to reopen without
>>>>>> switching to a new BlockDriverState instance. That means the reopen
>>>>>> needs to be in-place with respect to the BlockDriverState *bs pointer.
>>>>>> We cannot create a new instance.
>>>>>
>>>>> Yes, but where do you even get the second BlockDriverState from?
>>>>>
>>>>> My prototype only returns an int, not a new BlockDriverState. Until
>>>>> bdrv_reopen_commit() it would refer to the old file descriptors etc. and
>>>>> after bdrv_reopen_commit() the very same BlockDriverState would refer to
>>>>> the new ones.
>>>>
>>>> It seems I don't understand the API. I thought it was:
>>>>
>>>> do_block_set_hostcache()
>>>> {
>>>> bdrv_prepare_reopen(bs, &rs);
>>>> ...open new file and check everything is okay...
>>>> if (ret == 0) {
>>>> bdrv_reopen_commit(rs);
>>>> } else {
>>>> bdrv_reopen_abort(rs);
>>>> }
>>>> return ret;
>>>> }
>>>>
>>>> If the caller isn't opening the new file then what's the point of
>>>> giving the caller control over prepare, commit, and abort?
>>>
>>> After sending the last email I realized what I was missing:
>>>
>>> You need the prepare, commit, and abort API in order to handle
>>> multi-file block drivers like VMDK.
>>
>> Yes, this is whole point of separating commit out. Does the proposal
>> make sense to you now?
>
> It depends on the details. Adding more functions that every BlockDriver
> must implement is bad, so it's important that we only drop this
> functionality into raw-posix.c, vmdk.c, and block.c as appropriate.
Yes, I agree.
> I liked the idea of doing a generic FDStash type that the monitor and
> bdrv_reopen() can use. Blue's idea to hook at the qemu_open() level
> takes that further.
Well, having something that works for the raw-posix, the monitor and
maybe some more things is nice. Having something that works for
raw-posix, Sheepdog and rbd is an absolute requirement and I can't see
how FDStash solves that. Even raw-win32 doesn't have an int fd, but a
HANDLE hfile for its backend.
So my main concern is that file descriptors are a concept not as generic
as it needs to be to suit all block drivers.
> But if we can do prepare, commit, and abort in a relatively simple way
> then I'm for it.
Great, then let's do that.
Kevin
- Re: [Qemu-devel] Safely reopening image files by stashing fds, (continued)
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/05
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/05
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/08
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/08
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Stefan Hajnoczi, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds,
Kevin Wolf <=
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Blue Swirl, 2011/08/09
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/10
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Blue Swirl, 2011/08/10
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Kevin Wolf, 2011/08/11
- Re: [Qemu-devel] Safely reopening image files by stashing fds, Blue Swirl, 2011/08/11
Re: [Qemu-devel] Safely reopening image files by stashing fds, Blue Swirl, 2011/08/05