[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 00/10] vhost/qemu: thread per IO SCSI vq

From: Mike Christie
Subject: Re: [PATCH 00/10] vhost/qemu: thread per IO SCSI vq
Date: Thu, 19 Nov 2020 10:43:40 -0600
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.4.3

On 11/19/20 10:24 AM, Stefan Hajnoczi wrote:
On Thu, Nov 19, 2020 at 4:13 PM Mike Christie
<michael.christie@oracle.com> wrote:

On 11/19/20 8:46 AM, Michael S. Tsirkin wrote:
On Wed, Nov 18, 2020 at 11:31:17AM +0000, Stefan Hajnoczi wrote:
My preference has been:

1. If we were to ditch cgroups, then add a new interface that would allow
us to bind threads to a specific CPU, so that it lines up with the guest's
mq to CPU mapping.

A 1:1 vCPU/vq->CPU mapping isn't desirable in all cases.

The CPU affinity is a userspace policy decision. The host kernel should
provide a mechanism but not the policy. That way userspace can decide
which workers are shared by multiple vqs and on which physical CPUs they
should run.

So if we let userspace dictate the threading policy then I think binding
vqs to userspace threads and running there makes the most sense,
no need to create the threads.

Just to make sure I am on the same page, in one of the first postings of
this set at the bottom of the mail:


I asked about a new interface and had done something more like what
Stefan posted:

    struct vhost_vq_worker_info {
         * The pid of an existing vhost worker that this vq will be
         * assigned to. When pid is 0 the virtqueue is assigned to the
         * default vhost worker. When pid is -1 a new worker thread is
         * created for this virtqueue. When pid is -2 the virtqueue's
         * worker thread is unchanged.
         * If a vhost worker no longer has any virtqueues assigned to it
         * then it will terminate.
         * The pid of the vhost worker is stored to this field when the
         * ioctl completes successfully. Use pid -2 to query the current
         * vhost worker pid.
        __kernel_pid_t pid;  /* in/out */

        /* The virtqueue index*/
        unsigned int vq_idx; /* in */

This approach is simple and it allowed me to have userspace map queues
and threads optimally for our setups.

Note: Stefan, in response to your previous comment, I am just using my
1:1 mapping as an example and would make it configurable from userspace.

In the email above are you guys suggesting to execute the SCSI/vhost
requests in userspace? We should not do that because:

1. It negates part of what makes vhost fast where we do not have to kick
out to userspace then back to the kernel.

2. It's not doable or becomes a crazy mess because vhost-scsi is tied to
the scsi/target layer in the kernel. You can't process the scsi command
in userspace since the scsi state machine and all its configuration info
is in the kernel's scsi/target layer.

For example, I was just the maintainer of the target_core_user module
that hooks into LIO/target on the backend (vhost-scsi hooks in on the
front end) and passes commands to userspace and there we have a
semi-shadow state machine. It gets nasty to try and maintain/sync state
between lio/target core in the kernel and in userspace. We also see the
perf loss I mentioned in #1.

No, if I understand Michael correctly he has suggested a different approach.

My suggestion was that the kernel continues to manage the worker
threads but an ioctl allows userspace to control the policy.

I think Michael is saying that the kernel shouldn't manage/create
threads. Userspace should create threads and then invoke an ioctl from
those threads.

The ioctl will call into the vhost driver where it will execute
something similar to vhost_worker(). So this ioctl will block while
the kernel is using the thread to process vqs.

What isn't clear to me is how to tell the kernel which vqs are
processed by a thread. We could try to pass that information into the
ioctl. I'm not sure what the cleanest solution is here.

Maybe something like:

struct vhost_run_worker_info {
     struct timespec *timeout;
     sigset_t *sigmask;

     /* List of virtqueues to process */
     unsigned nvqs;
     unsigned vqs[];

/* This blocks until the timeout is reached, a signal is received, or
the vhost device is destroyed */
int ret = ioctl(vhost_fd, VHOST_RUN_WORKER, &info);

As you can see, userspace isn't involved with dealing with the
requests. It just acts as a thread donor to the vhost driver.

We would want the VHOST_RUN_WORKER calls to be infrequent to avoid the
penalty of switching into the kernel, copying in the arguments, etc.

I didn't get this part. Why have the timeout? When the timeout expires, does userspace just call right back down to the kernel or does it do some sort of processing/operation?

You could have your worker function run from that ioctl wait for a signal or a wake up call from the vhost_work/poll functions.

Michael: is this the kind of thing you were thinking of?


reply via email to

[Prev in Thread] Current Thread [Next in Thread]