qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 07/14] vfio-user: run vfio-user context


From: Jag Raman
Subject: Re: [PATCH v4 07/14] vfio-user: run vfio-user context
Date: Fri, 17 Dec 2021 17:59:48 +0000


> On Dec 16, 2021, at 6:17 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Wed, Dec 15, 2021 at 10:35:31AM -0500, Jagannathan Raman wrote:
>> @@ -114,6 +118,62 @@ static void vfu_object_set_device(Object *obj, const 
>> char *str, Error **errp)
>>     vfu_object_init_ctx(o, errp);
>> }
>> 
>> +static void vfu_object_ctx_run(void *opaque)
>> +{
>> +    VfuObject *o = opaque;
>> +    int ret = -1;
>> +
>> +    while (ret != 0) {
>> +        ret = vfu_run_ctx(o->vfu_ctx);
>> +        if (ret < 0) {
>> +            if (errno == EINTR) {
>> +                continue;
>> +            } else if (errno == ENOTCONN) {
>> +                qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
>> +                o->vfu_poll_fd = -1;
>> +                object_unparent(OBJECT(o));
>> +                break;
> 
> If nothing else logs a message then I think that should be done here so
> users know why their vfio-user server object disappeared.

Sure will do.

Do you prefer a trace, or a message to the console? Trace makes sense to me.
Presently, the client could unplug the vfio-user device which would trigger the
deletion of this object. This process could happen quietly.

> 
>> +            } else {
>> +                error_setg(&error_abort, "vfu: Failed to run device %s - 
>> %s",
>> +                           o->device, strerror(errno));
> 
> error_abort is equivalent to assuming !o->daemon. In the case where the
> user doesn't want to automatically shut down the process we need to log
> a message without aborting.

OK, makes sense.

> 
>> +                 break;
> 
> Indentation is off.
> 
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +static void vfu_object_attach_ctx(void *opaque)
>> +{
>> +    VfuObject *o = opaque;
>> +    GPollFD pfds[1];
>> +    int ret;
>> +
>> +    qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
>> +
>> +    pfds[0].fd = o->vfu_poll_fd;
>> +    pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
>> +
>> +retry_attach:
>> +    ret = vfu_attach_ctx(o->vfu_ctx);
>> +    if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
>> +        qemu_poll_ns(pfds, 1, 500 * (int64_t)SCALE_MS);
>> +        goto retry_attach;
> 
> This can block the thread indefinitely. Other events like monitor
> commands are not handled in this loop. Please make this asynchronous
> (set an fd handler and return from this function so we can try again
> later).
> 
> The vfu_attach_ctx() implementation synchronously negotiates the
> vfio-user connection :(. That's a shame because even if accept(2) is
> handled asynchronously, the negotiation can still block. It would be
> cleanest to have a fully async libvfio-user's vfu_attach_ctx() API to
> avoid blocking. Is that possible?

Thanos / John,

    Any thoughts on this?

> 
> If libvfio-user can't make vfu_attach_ctx() fully async then it may be
> possible to spawn a thread just for the blocking vfu_attach_ctx() call
> and report the result back to the event loop (e.g. using EventNotifier).
> That adds a bunch of code to work around a blocking API though, so I
> guess we can leave the blocking part if necessary.
> 
> At the very minimum, please make EAGAIN/EWOULDBLOCK async as mentioned
> above and add a comment explaining the situation with the
> partially-async vfu_attach_ctx() API so it's clear that this can block
> (that way it's clear that you're aware of the issue and this isn't by
> accident).

Sure, we could make the attach async at QEMU depending on how the
library prefers to do this.

> 
>> +    } else if (ret < 0) {
>> +        error_setg(&error_abort,
>> +                   "vfu: Failed to attach device %s to context - %s",
>> +                   o->device, strerror(errno));
> 
> error_abort assumes !o->daemon. Please handle the o->daemon == true
> case by logging an error without aborting.
> 
>> +        return;
>> +    }
>> +
>> +    o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx);
>> +    if (o->vfu_poll_fd < 0) {
>> +        error_setg(&error_abort, "vfu: Failed to get poll fd %s", 
>> o->device);
> 
> Same here.
> 
>> @@ -208,6 +284,8 @@ static void vfu_object_init(Object *obj)
>>                    TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
>>         return;
>>     }
>> +
>> +    o->vfu_poll_fd = -1;
>> }
> 
> This must call qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL)
> when o->vfu_poll_fd != -1 to avoid leaving a dangling fd handler
> callback registered.

This is during the init phase, and the FD handlers are not set. Do you mean
to add this at finalize?

I agree it’s good to explicitly add this at finalize. But vfu_destroy_ctx() 
should
trigger a ENOTCONN, which would do it anyway.

Thank you!
--
Jag


reply via email to

[Prev in Thread] Current Thread [Next in Thread]