From: Eugenio Perez Martin
Subject: Re: [PATCH v5 06/10] vdpa: Make vhost_vdpa_net_cvq_map_elem accept any out sg
Date: Thu, 4 Aug 2022 13:21:24 +0200

On Thu, Aug 4, 2022 at 10:52 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Aug 4, 2022 at 4:19 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Thu, Aug 4, 2022 at 9:51 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Aug 4, 2022 at 3:39 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > >
> > > > On Thu, Aug 4, 2022 at 6:17 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > >
> > > > > 在 2022/8/3 01:57, Eugenio Pérez 写道:
> > > > > > So it's generic enough to accept any out sg buffer and we can inject
> > > > > > NIC state messages.
> > > > > >
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > ---
> > > > > > v5: Accept out sg instead of dev_buffers[]
> > > > > > ---
> > > > > >   net/vhost-vdpa.c | 13 +++++++------
> > > > > >   1 file changed, 7 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > > index 33bf3d6409..2421bca347 100644
> > > > > > --- a/net/vhost-vdpa.c
> > > > > > +++ b/net/vhost-vdpa.c
> > > > > > @@ -302,16 +302,16 @@ dma_map_err:
> > > > > >   }
> > > > > >
> > > > > >   /**
> > > > > > - * Copy the guest element into a dedicated buffer suitable to be sent to NIC
> > > > > > + * Maps out sg and in buffer into dedicated buffers suitable to be sent to NIC
> > > > > >    */
> > > > > > -static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
> > > > > > -                                        VirtQueueElement *elem,
> > > > > > -                                        size_t *out_len)
> > > > > > +static bool vhost_vdpa_net_cvq_map_sg(VhostVDPAState *s,
> > > > > > +                                      const struct iovec *out, size_t out_num,
> > > > > > +                                      size_t *out_len)
> > > > >
> > > > >
> > > > > This still doesn't look general enough, as there's no guarantee that
> > > > > we won't have command-in-specific-data. One example is that Ali is
> > > > > working on the virtio-net statistics fetching from the control
> > > > > virtqueue.
> > > > >
> > > > > So it looks to me we'd better have a general bounce_map here that accepts:
> > > > >
> > > > > 1) out_sg and out_num
> > > > > 2) in_sg and in_num
> > > > >
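(For reference, I read the suggestion above as a signature along these
lines; the helper name is made up and the body intentionally omitted, only
the parameter list is the point:)

static bool vhost_vdpa_net_cvq_bounce_map(VhostVDPAState *s,
                                          const struct iovec *out_sg,
                                          size_t out_num,
                                          const struct iovec *in_sg,
                                          size_t in_num,
                                          size_t *out_len, size_t *in_len);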
> > > >
> > > > We don't need to pass in_sg for that: The only useful information is
> > > > its size.
> > >
> > > What if we support stats in the future where it extends the ctrl command:
> > >
> > >          u8 command;
> > >          u8 command-specific-data[];
> > >          u8 ack;
> > > +        u8 command-specific-data-reply[];
> > >
> > > in
> > >
> > > https://lists.oasis-open.org/archives/virtio-dev/202203/msg00000.html
> > >
> >
> > The guest will expose an in descriptor in whatever layout it wants.
> > QEMU reads that layout into the VirtQueueElement's in_num and in_sg
> > members, in QEMU's (and SVQ's) address space.
>
> Yes, but current code did:
>
> 1) map seems to be sg-based but unmap isn't, resulting in an asymmetric API

Ok I get this part now. More on this below.

> 2) NULL is passed to vhost_vdpa_cvq_map_buf()
>
>     ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
>                                 sizeof(virtio_net_ctrl_ack),
>                                 s->cvq_cmd_in_buffer, &in_copied, true);
>     if (unlikely(!ok)) {
>         vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
>         return false;
>     }
>
> So NULL ends up being used for iov_to_buf(), which is tricky (I'm not even
> sure it can work).
>

We can add a conditional to be sure.
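
Something like the following sketch is what I have in mind, assuming the
usual qemu/iov.h helpers; the real check would live inside
vhost_vdpa_cvq_map_buf(), and the helper name here is made up:

static size_t cvq_bounce_out(const struct iovec *out, size_t out_num,
                             void *bounce, size_t bounce_len)
{
    if (out == NULL || out_num == 0) {
        /* In buffer: no guest data to copy, just present a zeroed page. */
        memset(bounce, 0, bounce_len);
        return 0;
    }

    /* Out buffer: bounce the guest's control command into the buffer. */
    return iov_to_buf(out, out_num, 0, bounce, bounce_len);
}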

> And this won't work for commands that have in-data in the future.
>
> >
> > Since we don't want the guest to be able to modify the in buffer
> > maliciously, we offer the device a bounce in buffer. Since the in
> > buffer still contains no information, we only need its size, so we can
> > "allocate and map" an equivalent one for the device and memset to 0.
> > For simplicity, we allocate one page, so no need for iovec
> > complexities.
> >
> > After the device has written to it, we get the written length and verify
> > the information. If VIRTIO_NET_OK, we copy it to the guest's in
> > buffer, using the iov_from_buf() right after the out: label in
> > vhost_vdpa_net_handle_ctrl_avail. Instead of copying from the stack
> > variable "(status, sizeof(status))", we copy from
> > (s->cvq_cmd_in_buffer, written_len).
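
(Expanding a bit on what I wrote above: the copy-back could be as simple as
this sketch built on iov_from_buf(); the helper name is invented and the
final form depends on where we end up doing the unmap:)

static size_t cvq_forward_in_to_guest(const VirtQueueElement *elem,
                                      const void *bounce_in,
                                      size_t written_len)
{
    /*
     * Forward only what the device actually wrote (the ack, and any
     * command-specific reply in the future) to the guest's in sg.
     */
    return iov_from_buf(elem->in_sg, elem->in_num, 0, bounce_in, written_len);
}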
>
> Another asymmetry: the iov_to_buf() is hidden in map() but the
> iov_from_buf() is exposed in vhost_vdpa_net_handle_ctrl_avail.
> I'd suggest tweaking the code, otherwise it's not easy to read and
> maintain:
>
> map_sg()
> validate_cmd()
> add_to_svq()
> handle_ctrl()
> unmap_sg()
>
> Or since we know the location of the bounce buffer, we can simply do
> bouncing/iov_from_buf() explicitly before map_sg().
>

The main goal of creating this helper is to avoid duplicating the
code (especially the fallback on error paths) between the code that
bounces the guest's buffers and the code that sends the commands at
net device load time.

While what you propose is possible, maybe it's better to simply map
the buffer at the beginning of the device cycle and unmap it at the
end. To do so, I'll bring in the prepare() callback from the ASID
series and create a new stop() callback right after calling
vhost_dev_stop.
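
Roughly, I'm thinking of something like this; just a sketch, the actual
hook names and where they are wired are still to be decided, and it
assumes map_buf accepts a NULL sg as "map an empty buffer":

static int vhost_vdpa_net_cvq_prepare(VhostVDPAState *s)
{
    size_t unused;

    /* Map both bounce buffers once, before the device starts. */
    if (!vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
                                vhost_vdpa_net_cvq_cmd_len(),
                                s->cvq_cmd_out_buffer, &unused, false)) {
        return -ENOMEM;
    }
    if (!vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
                                sizeof(virtio_net_ctrl_ack),
                                s->cvq_cmd_in_buffer, &unused, true)) {
        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
        return -ENOMEM;
    }
    return 0;
}

static void vhost_vdpa_net_cvq_stop(VhostVDPAState *s)
{
    /* Unmap once, right after vhost_dev_stop(). */
    vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
    vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_in_buffer);
}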

> >
> > Note that this is still not enough for stats. We also need a way to:
> > * Update the virtio-net device model stats. virtio_net_handle_ctrl_iov
> > would try to write the virtio-net stats to the in buffer, not to update
> > the device model stats.
> > * Update the stats on the destination. Another ctrl command? Intercept
> > them via SVQ and simply sum source stats + current device stats? I'd
> > say the second is better, as it saves work for the device, but maybe
> > it's not.
>
> It should be sufficient to maintain a delta for each counter?
>

I think that's the best approach for (2), but we still need to develop (1).
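
(As a sketch of the delta idea, under the assumption that the device
model just adds the counters accumulated on the source to whatever the
current device reports; names are made up:)

typedef struct VdpaNetStatsDelta {
    uint64_t rx_packets;
    uint64_t tx_packets;
    /* ... one field per counter we want to preserve across migration. */
} VdpaNetStatsDelta;

static uint64_t stats_guest_visible(uint64_t device_count, uint64_t delta)
{
    /* Guest-visible value = destination device count + migrated delta. */
    return device_count + delta;
}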

Thanks!

> Thanks
>
> >
> > That's why I think this command should be left out for the moment:
> > making the modifications is not hard, but we should agree on how to do
> > them first.
> >
> > > > Since the exposed buffer is an in one, it's enough to expose
> > > > the s->cvq_cmd_in_buffer buffer in full. The caller already knows the
> > > > device will write to it, so the only missing piece is to return the
> > > > written length at vhost_vdpa_net_cvq_add.
> > > >
> > > > Is one page the right buffer size for the in buffer?
> > >
> > > We can start from this.
> > >
> > > > Is it worth
> > > > worrying about it before implementing the stat control command in qemu
> > > > virtio-net?
> > >
> > > If it's not complex, it's better to do that from the beginning;
> > > otherwise the user may be surprised and we'd need extra work. Anyhow,
> > > we should at least support the ack, which is an in_sg.
> > >
> > > >
> > > > > At this level, we'd better not have any special handling of the in
> > > > > buffer as the ack.
> > > >
> > > > We need to care about it. If a property has not been updated in the
> > > > vdpa device (it returned VIRTIO_NET_ERR), we must not update the
> > > > device model.
> > >
> > > Yes, but what I meant is at the level of the bouncing itself. If we get
> > > VIRTIO_NET_ERR, should we propagate it to the guest as well?
> > >
> >
> > Yes, we must; otherwise the guest thinks the command succeeded, doesn't it?
> >
> > Thanks!
> >
> > > Thanks
> > >
> > > >
> > > > We can move the processing from vhost_vdpa_net_cvq_add to
> > > > vhost_vdpa_net_load and vhost_vdpa_net_handle_ctrl_avail, but the code
> > > > gets duplicated then.
> > > >
> > > > > And we need to do bouncing:
> > > > >
> > > > > 1) for out buffer, during map
> > > > > 2) for in buffer during unmap
> > > > >
> > > >
> > > > We can move the copy of the in_buffer to the unmap for sure.
> > > >
> > > > Thanks!
> > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > >   {
> > > > > >       size_t in_copied;
> > > > > >       bool ok;
> > > > > >
> > > > > > -    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, elem->out_sg, elem->out_num,
> > > > > > +    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, out, out_num,
> > > > > >                                   vhost_vdpa_net_cvq_cmd_len(),
> > > > > >                                   s->cvq_cmd_out_buffer, out_len, false);
> > > > > >       if (unlikely(!ok)) {
> > > > > > @@ -435,7 +435,8 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> > > > > >       };
> > > > > >       bool ok;
> > > > > >
> > > > > > -    ok = vhost_vdpa_net_cvq_map_elem(s, elem, &dev_buffers[0].iov_len);
> > > > > > +    ok = vhost_vdpa_net_cvq_map_sg(s, elem->out_sg, elem->out_num,
> > > > > > +                                   &dev_buffers[0].iov_len);
> > > > > >       if (unlikely(!ok)) {
> > > > > >           goto out;
> > > > > >       }
> > > > >
> > > >
> > >
> >
>



