qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 32/45] Page request: Add MIG_RP_CMD_REQ_PAGES


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v5 32/45] Page request: Add MIG_RP_CMD_REQ_PAGES reverse command
Date: Wed, 25 Mar 2015 18:16:45 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* David Gibson (address@hidden) wrote:
> On Wed, Feb 25, 2015 at 04:51:55PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <address@hidden>
> > 
> > Add MIG_RP_CMD_REQ_PAGES command on Return path for the postcopy
> > destination to request a page from the source.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > ---
> >  include/migration/migration.h |  4 +++
> >  migration/migration.c         | 70 
> > +++++++++++++++++++++++++++++++++++++++++++
> >  trace-events                  |  1 +
> >  3 files changed, 75 insertions(+)
> > 
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index 2c607e7..2c15d63 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -46,6 +46,8 @@ enum mig_rpcomm_cmd {
> >      MIG_RP_CMD_INVALID = 0,  /* Must be 0 */
> >      MIG_RP_CMD_SHUT,         /* sibling will not send any more RP messages 
> > */
> >      MIG_RP_CMD_PONG,         /* Response to a PING; data (seq: be32 ) */
> > +
> > +    MIG_RP_CMD_REQ_PAGES,    /* data (start: be64, len: be64) */
> >  };
> >  
> >  /* Postcopy page-map-incoming - data about each page on the inbound side */
> > @@ -253,6 +255,8 @@ void migrate_send_rp_shut(MigrationIncomingState *mis,
> >                            uint32_t value);
> >  void migrate_send_rp_pong(MigrationIncomingState *mis,
> >                            uint32_t value);
> > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* 
> > rbname,
> > +                              ram_addr_t start, ram_addr_t len);
> >  
> >  void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
> >  void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> > diff --git a/migration/migration.c b/migration/migration.c
> > index bd066f6..2e9d0dd 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -138,6 +138,36 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
> >      migrate_send_rp_message(mis, MIG_RP_CMD_PONG, 4, (uint8_t *)&buf);
> >  }
> >  
> > +/* Request a range of pages from the source VM at the given
> > + * start address.
> > + *   rbname: Name of the RAMBlock to request the page in, if NULL it's the 
> > same
> > + *           as the last request (a name must have been given previously)
> > + *   Start: Address offset within the RB
> > + *   Len: Length in bytes required - must be a multiple of pagesize
> > + */
> > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char 
> > *rbname,
> > +                               ram_addr_t start, ram_addr_t len)
> > +{
> > +    uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto 
> > 256 */
> > +    uint64_t *buf64 = (uint64_t *)bufc;
> > +    size_t msglen = 16; /* start + len */
> > +
> > +    assert(!(len & 1));
> > +    if (rbname) {
> > +        int rbname_len = strlen(rbname);
> > +        assert(rbname_len < 256);
> > +
> > +        len |= 1; /* Flag to say we've got a name */
> > +        bufc[msglen++] = rbname_len;
> > +        memcpy(bufc + msglen, rbname, rbname_len);
> > +        msglen += rbname_len;
> > +    }
> > +
> > +    buf64[0] = cpu_to_be64((uint64_t)start);
> > +    buf64[1] = cpu_to_be64((uint64_t)len);
> > +    migrate_send_rp_message(mis, MIG_RP_CMD_REQ_PAGES, msglen, bufc);
> 
> So.. what's the reason we actually need ramblock names on the wire,
> rather than working purely from GPAs?
> 
> It occurs to me that referencing ramblock names from the wire protocol
> exposes something that's kind of an internal detail, and may limit our
> options for reworking the memory subsystem in future.

RAMBlock names are already exposed on the wire in precopy migration in
the forward direction anyway (see save_page_header), however there
are a few reasons:
  1) There's no guarantee that the page you are transmitting is currently
     mapped into the guest. The ACPI tables are never mapped.
  2) Aliases in GPA are allowed.

The only thing that's unique is a reference to the RAMBlock and an offset
within it.
(and yes, it does break when we change RAMBlock names but that's normally
accidental - it did happen sometime around QEMU 1.6 when the PCI strings
were accidentally changed with a knock on effect of renaming RAMBlocks).

> > +}
> > +
> >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> >  {
> >      const char *p;
> > @@ -789,6 +819,17 @@ static void source_return_path_bad(MigrationState *s)
> >  }
> >  
> >  /*
> > + * Process a request for pages received on the return path,
> > + * We're allowed to send more than requested (e.g. to round to our page 
> > size)
> > + * and we don't need to send pages that have already been sent.
> > + */
> > +static void migrate_handle_rp_req_pages(MigrationState *ms, const char* 
> > rbname,
> > +                                       ram_addr_t start, ram_addr_t len)
> > +{
> > +    trace_migrate_handle_rp_req_pages(start, len);
> > +}
> > +
> > +/*
> >   * Handles messages sent on the return path towards the source VM
> >   *
> >   */
> > @@ -800,6 +841,8 @@ static void *source_return_path_thread(void *opaque)
> >      const int max_len = 512;
> >      uint8_t buf[max_len];
> >      uint32_t tmp32;
> > +    ram_addr_t start, len;
> > +    char *tmpstr;
> >      int res;
> >  
> >      trace_source_return_path_thread_entry();
> > @@ -815,6 +858,11 @@ static void *source_return_path_thread(void *opaque)
> >              expected_len = 4;
> >              break;
> >  
> > +        case MIG_RP_CMD_REQ_PAGES:
> > +            /* 16 byte start/len _possibly_ plus an id str */
> > +            expected_len = 16 + 256;
> 
> Isn't that the maximum length, rather than the minimum or typical length?

I'm just trying to be fairly paranoid prior to reading the header.
If at this point we're working our way through garbage received off a misaligned
stream then if we fail to spot it in this switch then we end up doing
a qemu_get_buffer on that bad length.  Checking it just for a maximum
would make it safe against something malicious, but if the destination
hadn't sent that much data then we'd just block here and never get around
to reporting an error.  This way, for most command types we're
being pretty careful.


> > +            break;
> > +
> >          default:
> >              error_report("RP: Received invalid cmd 0x%04x length 0x%04x",
> >                      header_com, header_len);
> > @@ -860,6 +908,28 @@ static void *source_return_path_thread(void *opaque)
> >              trace_source_return_path_thread_pong(tmp32);
> >              break;
> >  
> > +        case MIG_RP_CMD_REQ_PAGES:
> > +            start = be64_to_cpup((uint64_t *)buf);
> > +            len = be64_to_cpup(((uint64_t *)buf)+1);
> > +            tmpstr = NULL;
> > +            if (len & 1) {
> > +                len -= 1; /* Remove the flag */
> > +                /* Now we expect an idstr */
> > +                tmp32 = buf[16]; /* Length of the following idstr */
> > +                tmpstr = (char *)&buf[17];
> > +                buf[17+tmp32] = '\0';
> > +                expected_len = 16+1+tmp32;
> > +            } else {
> > +                expected_len = 16;
> 
> Ah.. so expected_len is changed here.  But then what was the point of
> setting it in the earlier switch?

For the upper bound on the qemu_get_buffer.

Dave

> 
> > +            }
> > +            if (header_len != expected_len) {
> > +                error_report("RP: Req_Page with length %d expecting %d",
> > +                        header_len, expected_len);
> > +                source_return_path_bad(ms);
> > +            }
> > +            migrate_handle_rp_req_pages(ms, tmpstr, start, len);
> > +            break;
> > +
> >          default:
> >              /* This shouldn't happen because we should catch this above */
> >              trace_source_return_path_bad_header_com();
> > diff --git a/trace-events b/trace-events
> > index bcbdef8..9bedee4 100644
> > --- a/trace-events
> > +++ b/trace-events
> > @@ -1404,6 +1404,7 @@ migrate_fd_error(void) ""
> >  migrate_fd_cancel(void) ""
> >  migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t 
> > nonpost) "pending size %" PRIu64 " max %" PRIu64 " (post=%" PRIu64 " 
> > nonpost=%" PRIu64 ")"
> >  migrate_send_rp_message(int cmd, uint16_t len) "cmd=%d, len=%d"
> > +migrate_handle_rp_req_pages(size_t start, size_t len) "at %zx for len %zx"
> >  migration_thread_after_loop(void) ""
> >  migration_thread_file_err(void) ""
> >  migration_thread_setup_complete(void) ""
> 
> -- 
> David Gibson                  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au        | minimalist, thank you.  NOT _the_ 
> _other_
>                               | _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]