qemu-devel

Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY


From: Dr. David Alan Gilbert
Subject: Re: [RFC PATCH] hw/virtio/vhost: re-factor vhost-section and allow DIRTY_MEMORY_CODE
Date: Thu, 4 Jun 2020 14:07:29 +0100
User-agent: Mutt/1.13.4 (2020-02-15)

* Alex Bennée (alex.bennee@linaro.org) wrote:
> 
> Michael S. Tsirkin <mst@redhat.com> writes:
> 
> > On Thu, Jun 04, 2020 at 12:49:17PM +0100, Alex Bennée wrote:
> >> 
> >> Michael S. Tsirkin <mst@redhat.com> writes:
> >> 
> >> > On Thu, Jun 04, 2020 at 12:13:23PM +0100, Alex Bennée wrote:
> >> >> The purpose of vhost_section is to identify RAM regions that need to
> >> >> be made available to a vhost client. However when running under TCG
> >> >> all RAM sections have DIRTY_MEMORY_CODE set which leads to problems
> >> >> down the line. The original comment implies VGA regions are a problem
> >> >> but doesn't explain why vhost has a problem with it.
> >> >> 
> >> >> Re-factor the code so:
> >> >> 
> >> >>   - steps are clearer to follow
> >> >>   - reason for rejection is recorded in the trace point
> >> >>   - we allow DIRTY_MEMORY_CODE when TCG is enabled
> >> >> 
> >> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> >> Cc: Michael S. Tsirkin <mst@redhat.com>
> >> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> >> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >> >> ---
> >> >>  hw/virtio/vhost.c | 46 ++++++++++++++++++++++++++++++++--------------
> >> >>  1 file changed, 32 insertions(+), 14 deletions(-)
> >> >> 
> >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> >> index aff98a0ede5..f81fc87e74c 100644
> >> >> --- a/hw/virtio/vhost.c
> >> >> +++ b/hw/virtio/vhost.c
> >> >> @@ -27,6 +27,7 @@
> >> >>  #include "migration/blocker.h"
> >> >>  #include "migration/qemu-file-types.h"
> >> >>  #include "sysemu/dma.h"
> >> >> +#include "sysemu/tcg.h"
> >> >>  #include "trace.h"
> >> >>  
> >> >>  /* enabled until disconnected backend stabilizes */
> >> >> @@ -403,26 +404,43 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
> >> >>      return r;
> >> >>  }
> >> >>  
> >> >> +/*
> >> >> + * vhost_section: identify sections needed for vhost access
> >> >> + *
> >> >> + * We only care about RAM sections here (where virtqueue can live). If
> >> >> + * we find one we still allow the backend to potentially filter it out
> >> >> + * of our list.
> >> >> + */
> >> >>  static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
> >> >>  {
> >> >> -    bool result;
> >> >> -    bool log_dirty = memory_region_get_dirty_log_mask(section->mr) &
> >> >> -                     ~(1 << DIRTY_MEMORY_MIGRATION);
> >> >> -    result = memory_region_is_ram(section->mr) &&
> >> >> -        !memory_region_is_rom(section->mr);
> >> >> -
> >> >> -    /* Vhost doesn't handle any block which is doing dirty-tracking other
> >> >> -     * than migration; this typically fires on VGA areas.
> >> >> -     */
> >> >> -    result &= !log_dirty;
> >> >> +    enum { OK = 0, NOT_RAM, DIRTY, FILTERED } result = NOT_RAM;
> >> >
> >> > I'm not sure what this enum buys us compared to a bool.
> >> 
> >> The only real point of the enum is to give a little more detailed
> >> information to the trace point to expose why a section wasn't included.
> >> In a previous iteration I just had the tracepoint at the bottom before a
> >> return true where all other legs had returned false. We could switch to
> >> just having the tracepoint hit for explicit inclusions?
> >
> > I didn't notice.  Yes, ok more tracepoints IMHO.
> 
> I can simplify to two:
> 
>   trace_vhost_section(mr->name)
>   trace_vhost_reject_section(mr->name, int reason)
> 
> Not sure if it's worth defining an enum outside just for the purposes of
> the trace though. Do we have the concept of per-trace event enum codes?

If you want a 'reason' for the trace, then why not just make
  const char *result
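
As a rough sketch of the string-reason idea, assuming a simplified
stand-in for the section (the struct, the function name and the reason
strings below are hypothetical, and the enum values only mirror the
DIRTY_MEMORY_* indices for illustration; the backend filter step is
left out). NULL means the section is accepted, anything else is a
static string that a reject trace point could print as-is:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the dirty-memory client indices (assumed values). */
enum { DIRTY_MEMORY_VGA = 0, DIRTY_MEMORY_CODE = 1, DIRTY_MEMORY_MIGRATION = 2 };

/* Hypothetical, simplified view of a memory section; not a QEMU type. */
struct mock_section {
    bool is_ram;
    bool is_rom;
    uint8_t dirty_mask;   /* bit N set => dirty client N is logging */
};

/*
 * Returns NULL if vhost could use the section, otherwise a static
 * string naming the reason it was rejected.
 */
const char *vhost_section_reject_reason(const struct mock_section *s)
{
    /* Dirty clients that do not get in vhost's way in this sketch. */
    const uint8_t handled = (1u << DIRTY_MEMORY_MIGRATION) |
                            (1u << DIRTY_MEMORY_CODE);

    assert(s != NULL);

    if (!s->is_ram || s->is_rom) {
        return "not RAM";
    }
    if (s->dirty_mask & ~handled) {
        return "unhandled dirty tracking";
    }
    return NULL;
}
```

The caller would then fire the accept trace point on NULL and pass the
string to the reject trace point otherwise, so nobody has to decode
numeric reason codes from the trace output.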

Dave

> >> > Also why force OK to 0?
> >> 
> >> Personal preference where 0 indicates success and !0 indicates failure
> >> of various kinds. Again we can drop if we don't want the information in
> >> the tracepoint.
> >
> > So in that case we need to set all values so people can decode them
> > from the trace. But I think it's best to just have more trace points
> > or drop it from the trace.
> >
> >> > And I prefer an explicit "else result = NOT_RAM" below
> >> > instead of initializing it here.
> >> 
> >> Ok.
> >> 
> >> >
> >> >> +
> >> >> +    if (memory_region_is_ram(section->mr) && !memory_region_is_rom(section->mr)) {
> >> >> +        uint8_t dirty_mask = memory_region_get_dirty_log_mask(section->mr);
> >> >> +        uint8_t handled_dirty;
> >> >>  
> >> >> -    if (result && dev->vhost_ops->vhost_backend_mem_section_filter) {
> >> >> -        result &=
> >> >> -            dev->vhost_ops->vhost_backend_mem_section_filter(dev, section);
> >> >> +        /*
> >> >> +         * Vhost doesn't handle any block which is doing dirty-tracking other
> >> >> +         * than migration; this typically fires on VGA areas. However
> >> >> +         * for TCG we also do dirty code page tracking which shouldn't
> >> >> +         * get in the way.
> >> >> +         */
> >> >> +        handled_dirty = (1 << DIRTY_MEMORY_MIGRATION);
> >> >> +        if (tcg_enabled()) {
> >> >> +            handled_dirty |= (1 << DIRTY_MEMORY_CODE);
> >> >> +        }
> >> >
> >> > So DIRTY_MEMORY_CODE is only set by TCG right? Thus I'm guessing
> >> > we can just allow this unconditionally.
> >> 
> >> Which actually makes the test:
> >> 
> >>   if (dirty_mask & (1 << DIRTY_MEMORY_VGA)) {
> >>      .. fail ..
> >>   }
> >> 
> >> which is more in line with the comment although wouldn't fail if we
> >> added additional DIRTY_MEMORY flags. This leads to the question what
> >> exactly is it about DIRTY tracking that vhost doesn't like.
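
The difference between the two tests can be sketched in isolation. This
is only an illustration: the enum values are stand-ins for the
DIRTY_MEMORY_* indices, DIRTY_MEMORY_FUTURE is a hypothetical flag that
does not exist, and both function names are made up. The allow-list
form rejects anything it does not explicitly know about, while the
VGA-only form would silently accept a newly added client:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the dirty-memory client indices (assumed values). */
enum {
    DIRTY_MEMORY_VGA = 0,
    DIRTY_MEMORY_CODE = 1,
    DIRTY_MEMORY_MIGRATION = 2,
    DIRTY_MEMORY_FUTURE = 3   /* hypothetical flag added later */
};

/* Allow-list: reject if any client other than the handled ones logs. */
bool rejects_allow_list(uint8_t dirty_mask)
{
    const uint8_t handled = (1u << DIRTY_MEMORY_MIGRATION) |
                            (1u << DIRTY_MEMORY_CODE);

    return (dirty_mask & ~handled) != 0;
}

/* Deny-list: reject only if the VGA client is logging. */
bool rejects_deny_list(uint8_t dirty_mask)
{
    return (dirty_mask & (1u << DIRTY_MEMORY_VGA)) != 0;
}
```

Both agree on VGA, CODE and MIGRATION; they only diverge for a mask
with the hypothetical future bit set, which is the case raised above.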
> >
> > vhost does not know how to track writes to specific regions. It can either
> > track all writes to memory (which slows it down quite a bit)
> > or no writes.
> 
> So can vhost interfere with dirty tracking itself in the kernel by
> trapping the writes? I guess there is no way this can happen with
> vhost-user?
> 
> (I wonder what would happen if a vhost-user daemon did an mprotect() on
> RAM from its shared view?)
> 
> > It never actually *needs* to write to VGA,
> > so we do a hack and just skip these and then if that's the
> > only thing we need to track then we don't need to enable
> > its dirty tracking.
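
A toy model of that all-or-nothing behaviour, as I read the description
above (this is not the vhost code: the struct and function are
hypothetical and the enum values are stand-ins). Sections with VGA
dirty tracking are skipped entirely, and logging is switched on for
everything only if some remaining section needs migration tracking:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the dirty-memory client indices (assumed values). */
enum { DIRTY_MEMORY_VGA = 0, DIRTY_MEMORY_CODE = 1, DIRTY_MEMORY_MIGRATION = 2 };

/* Hypothetical per-section summary; not a QEMU type. */
struct mock_section {
    uint8_t dirty_mask;   /* bit N set => dirty client N is logging */
};

/*
 * Logging is modelled as global: either every write is tracked or
 * none is. VGA-tracked sections are never handed over, so they cannot
 * be the reason to turn logging on.
 */
bool vhost_needs_global_log(const struct mock_section *secs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (secs[i].dirty_mask & (1u << DIRTY_MEMORY_VGA)) {
            continue;   /* skipped: not shared in this model */
        }
        if (secs[i].dirty_mask & (1u << DIRTY_MEMORY_MIGRATION)) {
            return true;
        }
    }
    return false;
}
```

With only a VGA-tracked section present the function returns false, so
the slow track-everything mode is avoided.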
> >
> > I don't really know what DIRTY_MEMORY_CODE is or when it's set.
> 
> We use it in softmmu so that any pages that have code in them always
> force the slow path into cputlb for writes to those pages. This allows
> us to detect self-modifying code. The kernel would never get involved,
> but I don't think vhost and TCG are compatible anyway. I'm only really
> interested in vhost-user and its interaction with TCG.
> 
> I'll spin a v2 now.
> 
> -- 
> Alex Bennée
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



