qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v9 2/5] virtio-balloon: VIRTIO_


From: Wang, Wei W
Subject: Re: [Qemu-devel] [virtio-dev] Re: [PATCH v9 2/5] virtio-balloon: VIRTIO_BALLOON_F_BALLOON_CHUNKS
Date: Wed, 26 Apr 2017 11:03:34 +0000

Hi Michael, could you please give some feedback?

On Monday, April 17, 2017 11:35 AM, Wei Wang wrote:
> On 04/15/2017 05:38 AM, Michael S. Tsirkin wrote:
> > On Fri, Apr 14, 2017 at 04:37:52PM +0800, Wei Wang wrote:
> >> On 04/14/2017 12:34 AM, Michael S. Tsirkin wrote:
> >>> On Thu, Apr 13, 2017 at 05:35:05PM +0800, Wei Wang wrote:
> >>>
> >>> So we don't need the bitmap to talk to host, it is just a data
> >>> structure we chose to maintain lists of pages, right?
> >> Right. bitmap is the way to gather pages to chunk.
> >> It's only needed in the balloon page case.
> >> For the unused page case, we don't need it, since the free page
> >> blocks are already chunks.
> >>
> >>> OK as far as it goes but you need much better isolation for it.
> >>> Build a data structure with APIs such as _init, _cleanup, _add,
> >>> _clear, _find_first, _find_next.
> >>> Completely unrelated to pages, it just maintains bits.
> >>> Then use it here.
> >>>
> >>>
> >>>>    static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> >>>>    module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> >>>>    MODULE_PARM_DESC(oom_pages, "pages to free on OOM"); @@ -50,6
> >>>> +54,10 @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> >>>>    static struct vfsmount *balloon_mnt;
> >>>>    #endif
> >>>> +/* Types of pages to chunk */
> >>>> +#define PAGE_CHUNK_TYPE_BALLOON 0
> >>>> +
> >>> Doesn't look like you are ever adding more types in this patchset.
> >>> Pls keep code simple, generalize it later.
> >>>
> >> "#define PAGE_CHUNK_TYPE_UNUSED 1" is added in another patch.
> > I would say add the extra code there too. Or maybe we can avoid adding
> > it altogether.
> 
> I'm trying to have the two features( i.e. "balloon pages" and "unused pages")
> decoupled while trying to use common functions to deal with the commonalities.
> That's the reason to define the above macro.
> Without the macro, we will need to have separate functions, for example,
> instead of one "add_one_chunk()", we need to have
> add_one_balloon_page_chunk() and add_one_unused_page_chunk(), and some
> of the implementations will be kind of duplicate in the two functions.
> Probably we can add it when the second feature comes to the code.
> 
> >
> >> Types of page to chunk are treated differently. Different types of
> >> page chunks are sent to the host via different protocols.
> >>
> >> 1) PAGE_CHUNK_TYPE_BALLOON: Ballooned (i.e. inflated/deflated) pages
> >> to chunk.  For the ballooned type, it uses the basic chunk msg format:
> >>
> >> virtio_balloon_page_chunk_hdr +
> >> virtio_balloon_page_chunk * MAX_PAGE_CHUNKS
> >>
> >> 2) PAGE_CHUNK_TYPE_UNUSED: unused pages to chunk. It uses this miscq
> >> msg
> >> format:
> >> miscq_hdr +
> >> virtio_balloon_page_chunk_hdr +
> >> virtio_balloon_page_chunk * MAX_PAGE_CHUNKS
> >>
> >> The chunk msg is actually the payload of the miscq msg.
> >>
> >>
> > So just combine the two message formats and then it'll all be easier?
> >
> 
> Yes, it'll be simple with only one msg format. But the problem I see here is 
> that
> miscq hdr is something necessary for the "unused page"
> usage, but not needed by the "balloon page" usage. To be more precise, struct
> virtio_balloon_miscq_hdr {
>   __le16 cmd;
>   __le16 flags;
> };
> 'cmd' specifies  the command from the miscq (I envision that miscq will be
> further used to handle other possible miscellaneous requests either from the
> host or to the host), so 'cmd' is necessary for the miscq. But the inflateq is
> exclusively used for inflating pages, so adding a command to it would be
> redundant and look a little bewildered there.
> 'flags': We currently use bit 0 of flags to indicate the completion ofa 
> command,
> this is also useful in the "unused page" usage, and not needed by the "balloon
> page" usage.
> >>>> +#define MAX_PAGE_CHUNKS 4096
> >>> This is an order-4 allocation. I'd make it 4095 and then it's an
> >>> order-3 one.
> >> Sounds good, thanks.
> >> I think it would be better to make it 4090. Leave some space for the
> >> hdr as well.
> > And miscq hdr. In fact just let compiler do the math - something like:
> > (8 * PAGE_SIZE - sizeof(hdr)) / sizeof(chunk)
> Agree, thanks.
> 
> >
> > I skimmed explanation of algorithms below but please make sure code
> > speaks for itself and add comments inline to document it.
> > Whenever you answered me inline this is where you want to try to make
> > code clearer and add comments.
> >
> > Also, pls find ways to abstract the data structure so we don't need to
> > deal with its internals all over the code.
> >
> >
> > ....
> >
> >>>>    {
> >>>>          struct scatterlist sg;
> >>>> +        struct virtio_balloon_page_chunk_hdr *hdr;
> >>>> +        void *buf;
> >>>>          unsigned int len;
> >>>> -        sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> >>>> +        switch (type) {
> >>>> +        case PAGE_CHUNK_TYPE_BALLOON:
> >>>> +                hdr = vb->balloon_page_chunk_hdr;
> >>>> +                len = 0;
> >>>> +                break;
> >>>> +        default:
> >>>> +                dev_warn(&vb->vdev->dev, "%s: chunk %d of unknown
> pages\n",
> >>>> +                         __func__, type);
> >>>> +                return;
> >>>> +        }
> >>>> -        /* We should always be able to add one buffer to an empty 
> >>>> queue. */
> >>>> -        virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> >>>> -        virtqueue_kick(vq);
> >>>> +        buf = (void *)hdr - len;
> >>> Moving back to before the header? How can this make sense?
> >>> It works fine since len is 0, so just buf = hdr.
> >>>
> >> For the unused page chunk case, it follows its own protocol:
> >> miscq_hdr + payload(chunk msg).
> >>   "buf = (void *)hdr - len" moves the buf pointer to the miscq_hdr,
> >> to send the entire miscq msg.
> > Well just pass the correct pointer in.
> >
> OK. The miscq msg is
> {
> miscq_hdr;
> chunk_msg;
> }
> 
> We can probably change the code like this:
> 
> #define CHUNK_TO_MISCQ_MSG(chunk) (chunk - sizeof(struct
> virtio_balloon_miscq_hdr))
> 
> switch (type) {
>          case PAGE_CHUNK_TYPE_BALLOON:
>                  msg_buf = vb->balloon_page_chunk_hdr;
>                  msg_len = sizeof(struct virtio_balloon_page_chunk_hdr) +
>                      nr_chunks * sizeof(struct 
> virtio_balloon_page_chunk_entry);
>                  break;
>          case PAGE_CHUNK_TYPE_UNUSED:
>                  msg_buf = CHUNK_TO_MISCQ_MSG(vb->unused_page_chunk_hdr);
>                  msg_len = sizeof(struct virtio_balloon_miscq_hdr) + 
> sizeof(struct
> virtio_balloon_page_chunk_hdr) +
>                      nr_chunks * sizeof(struct 
> virtio_balloon_page_chunk_entry);
>                  break;
>          default:
>                  dev_warn(&vb->vdev->dev, "%s: chunk %d of unknown pages\n",
>                           __func__, type);
>                  return;
>          }
> 
> 
> 
> >> Please check the patch for implementing the unused page chunk, it
> >> will be clear. If necessary, I can put "buf = (void *)hdr - len" from
> >> that patch.
> > Exactly. And all this pointer math is very messy. Please look for ways
> > to clean it. It's generally easy to fill structures:
> >
> > struct foo *foo = kmalloc(..., sizeof(*foo) + n * sizeof(foo->a[0]));
> > for (i = 0; i < n; ++i)
> >     foo->a[i] = b;
> >
> > this is the kind of code that's easy to understand and it's obvious
> > there are no overflows and no info leaks here.
> >
> OK, will take your suggestion:
> 
> struct virtio_balloon_page_chunk {
>       struct virtio_balloon_page_chunk_hdr hdr;
>       struct virtio_balloon_page_chunk_entry entries[]; };
> 
> 
> >>>> +        len += sizeof(struct virtio_balloon_page_chunk_hdr);
> >>>> +        len += hdr->chunks * sizeof(struct virtio_balloon_page_chunk);
> >>>> +        sg_init_table(&sg, 1);
> >>>> +        sg_set_buf(&sg, buf, len);
> >>>> +        if (!virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL)) {
> >>>> +                virtqueue_kick(vq);
> >>>> +                if (busy_wait)
> >>>> +                        while (!virtqueue_get_buf(vq, &len) &&
> >>>> +                               !virtqueue_is_broken(vq))
> >>>> +                                cpu_relax();
> >>>> +                else
> >>>> +                        wait_event(vb->acked, virtqueue_get_buf(vq, 
> >>>> &len));
> >>>> +                hdr->chunks = 0;
> >>> Why zero it here after device used it? Better to zero before use.
> >> hdr->chunks tells the host how many chunks are there in the payload.
> >> After the device use it, it is ready to zero it.
> > It's rather confusing. Try to pass # of chunks around in some other
> > way.
> 
> Not sure if this was explained clearly - we just let the chunk msg hdr 
> indicates
> the # of chunks in the payload. I think this should be a pretty normal usage, 
> like
> the network UDP hdr, which uses a length field to indicate the packet length.
> 
> >>>> +        }
> >>>> +}
> >>>> +
> >>>> +static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue 
> >>>> *vq,
> >>>> +                          int type, u64 base, u64 size)
> >>> what are the units here? Looks like it's in 4kbyte units?
> >> what is the "unit" you referred to?
> >> This is the function to add one chunk, base pfn and size of the chunk
> >> are supplied to the function.
> >>
> > Are both size and base in bytes then?
> > But you do not send them to host as is, you shift them for some reason
> > before sending them to host.
> >
> Not in bytes actually. base is a base pfn, which is the starting address of 
> the
> continuous pfns. Size is the chunk size, which is the number of continuous 
> pfns.
> 
> They are shifted based on the chunk format we agreed before:
> 
> --------------------------------------------------------
> |                 Base (52 bit)        | Rsvd (12 bit) |
> --------------------------------------------------------
> --------------------------------------------------------
> |                 Size (52 bit)        | Rsvd (12 bit) |
> --------------------------------------------------------
> 
> 
> Here, the pfn will be the balloon page pfn (4KB).In this way, the host doesn't
> need to know PAGE_SIZE of the guest.
> 
> 
> 
> >>>> +                        if (zero >= end)
> >>>> +                                chunk_size = end - one;
> >>>> +                        else
> >>>> +                                chunk_size = zero - one;
> >>>> +
> >>>> +                        if (chunk_size)
> >>>> +                                add_one_chunk(vb, vq,
> PAGE_CHUNK_TYPE_BALLOON,
> >>>> +                                              pfn_start + one, 
> >>>> chunk_size);
> >>> Still not so what does a bit refer to? page or 4kbytes?
> >>> I think it should be a page.
> >> A bit in the bitmap corresponds to a pfn of a balloon page(4KB).
> > That's a waste on systems with large page sizes, and it does not look
> > like you handle that case correctly.
> 
> OK, I will change the bitmap to be PAGE_SIZE based here, instead of
> BALLOON_PAGE_SIZE based. When convert them into chunks, making it based
> on BALLOON_PAGE_SIZE.
> 
> 
> Best,
> Wei
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: address@hidden
> For additional commands, e-mail: address@hidden




reply via email to

[Prev in Thread] Current Thread [Next in Thread]