From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v6 2/2] block: Support GlusterFS as a QEMU block backend
Date: Wed, 15 Aug 2012 10:00:27 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0
On 15.08.2012 07:21, Bharata B Rao wrote:
> On Tue, Aug 14, 2012 at 10:29:26AM +0200, Kevin Wolf wrote:
>>>>> +static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
>>>>> +{
>>>>> +    GlusterAIOCB *acb = (GlusterAIOCB *)arg;
>>>>> +    BDRVGlusterState *s = acb->common.bs->opaque;
>>>>> +
>>>>> +    acb->ret = ret;
>>>>> +    if (qemu_gluster_send_pipe(s, acb) < 0) {
>>>>> +        /*
>>>>> +         * Gluster AIO callback thread failed to notify the waiting
>>>>> +         * QEMU thread about IO completion. Nothing much can be done
>>>>> +         * here but to abruptly abort.
>>>>> +         *
>>>>> +         * FIXME: Check if the read side of the fd handler can somehow
>>>>> +         * be notified of this failure paving the way for a graceful exit.
>>>>> +         */
>>>>> +        error_report("Gluster failed to notify QEMU about IO completion");
>>>>> +        abort();
>>>>
>>>> In the extreme case you may choose to make this disk inaccessible
>>>> (something like bs->drv = NULL), but abort() kills the whole VM and
>>>> should only be called when there is a bug.
>>>
>>> There have been concerns raised about this earlier too. I settled for this
>>> since I couldn't see a better way out, and I could see a precedent
>>> for this in posix-aio-compat.c.
>>>
>>> So I could just do the necessary cleanup, set bs->drv to NULL and return
>>> from here? But how do I wake up the QEMU thread that is waiting on the
>>> read side of the pipe? Without that, the QEMU thread that waits on the
>>> read side of the pipe is still hung.
>>
>> There is no other thread. But you're right, you should probably
>> unregister the aio_fd_handler and any other pending callbacks.
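
(For concreteness, such an error path might look roughly like the sketch
below. The helper and the fd_read/fd_write field names are hypothetical,
not taken from the patch; the qemu_aio_set_fd_handler() call mirrors the
one used later in this thread:)

    /* Hypothetical sketch: take the disk offline instead of aborting.
     * fd_read/fd_write are assumed names for the notification pipe ends
     * kept in BDRVGlusterState. */
    static void qemu_gluster_fail_disk(BDRVGlusterState *s, BlockDriverState *bs)
    {
        /* Unregister the handler first so nothing fires on a stale fd. */
        qemu_aio_set_fd_handler(s->fd_read, NULL, NULL, NULL, NULL);
        close(s->fd_read);
        close(s->fd_write);
        bs->drv = NULL;    /* disk becomes inaccessible; the VM keeps running */
    }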
>
> As I clarified in the other mail, this (gluster_finish_aiocb) is called
> from gluster thread context, and hence the QEMU thread that raised the
> original read/write request is still blocked on qemu_aio_wait().
>
> I tried the following cleanup instead of the abrupt abort:
>
> close(read_fd);  /* This will wake up the QEMU thread blocked on select(read_fd...) */
> close(write_fd);
> qemu_aio_set_fd_handler(read_fd, NULL, NULL, NULL, NULL);
> qemu_aio_release(acb);
> s->qemu_aio_count--;
> bs->drv = NULL;
>
> I tested this by manually injecting faults into qemu_gluster_send_pipe().
> With the above cleanup, the guest kernel crashes with IO errors.
What does "crash" really mean? IO errors certainly shouldn't cause a
kernel to crash?
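
(As an aside, the fault injection mentioned above could be as simple as the
sketch below. The body of qemu_gluster_send_pipe() shown here is only a
guess at its shape from this discussion, an acb pointer written to a pipe,
and the fault_injected flag and fd_write field name are invented for
illustration:)

    /* Sketch: force the notification path to fail so the error handling
     * in gluster_finish_aiocb() can be exercised. */
    static bool fault_injected = true;    /* invented test switch */

    static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb)
    {
        ssize_t ret;

        if (fault_injected) {
            return -1;    /* simulate a failed pipe notification */
        }
        ret = write(s->fd_write, &acb, sizeof(acb));
        return ret == sizeof(acb) ? 0 : -1;
    }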
> Is there anything else that I need to do, or do differently, to keep the
> VM running without disk access?
>
> I thought of completing the aio callback by doing
>     acb->common.cb(acb->common.opaque, -EIO);
> but that would do a coroutine enter from the gluster thread, which I don't
> think should be done.
You would have to take the global qemu mutex at least. I agree it's not
a good thing to do.
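
(Written out, the dismissed idea would amount to something like this sketch,
shown only to make the objection concrete:)

    /* Rejected approach: complete the request directly from the gluster
     * thread while holding the global QEMU mutex. This re-enters QEMU
     * code (potentially a coroutine) from a foreign thread. */
    qemu_mutex_lock_iothread();
    acb->common.cb(acb->common.opaque, -EIO);
    qemu_aio_release(acb);
    qemu_mutex_unlock_iothread();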
Kevin