[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5 32/45] Page request: Add MIG_RP_CMD_REQ_PAGES
From: |
David Gibson |
Subject: |
Re: [Qemu-devel] [PATCH v5 32/45] Page request: Add MIG_RP_CMD_REQ_PAGES reverse command |
Date: |
Thu, 26 Mar 2015 12:28:39 +1100 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
On Wed, Mar 25, 2015 at 06:16:45PM +0000, Dr. David Alan Gilbert wrote:
> * David Gibson (address@hidden) wrote:
> > On Wed, Feb 25, 2015 at 04:51:55PM +0000, Dr. David Alan Gilbert (git)
> > wrote:
> > > From: "Dr. David Alan Gilbert" <address@hidden>
> > >
> > > Add MIG_RP_CMD_REQ_PAGES command on Return path for the postcopy
> > > destination to request a page from the source.
> > >
> > > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > > ---
> > > include/migration/migration.h | 4 +++
> > > migration/migration.c | 70
> > > +++++++++++++++++++++++++++++++++++++++++++
> > > trace-events | 1 +
> > > 3 files changed, 75 insertions(+)
> > >
> > > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > > index 2c607e7..2c15d63 100644
> > > --- a/include/migration/migration.h
> > > +++ b/include/migration/migration.h
> > > @@ -46,6 +46,8 @@ enum mig_rpcomm_cmd {
> > > MIG_RP_CMD_INVALID = 0, /* Must be 0 */
> > > MIG_RP_CMD_SHUT, /* sibling will not send any more RP
> > > messages */
> > > MIG_RP_CMD_PONG, /* Response to a PING; data (seq: be32 ) */
> > > +
> > > + MIG_RP_CMD_REQ_PAGES, /* data (start: be64, len: be64) */
> > > };
> > >
> > > /* Postcopy page-map-incoming - data about each page on the inbound side
> > > */
> > > @@ -253,6 +255,8 @@ void migrate_send_rp_shut(MigrationIncomingState *mis,
> > > uint32_t value);
> > > void migrate_send_rp_pong(MigrationIncomingState *mis,
> > > uint32_t value);
> > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char*
> > > rbname,
> > > + ram_addr_t start, ram_addr_t len);
> > >
> > > void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
> > > void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index bd066f6..2e9d0dd 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -138,6 +138,36 @@ void migrate_send_rp_pong(MigrationIncomingState
> > > *mis,
> > > migrate_send_rp_message(mis, MIG_RP_CMD_PONG, 4, (uint8_t *)&buf);
> > > }
> > >
> > > +/* Request a range of pages from the source VM at the given
> > > + * start address.
> > > + * rbname: Name of the RAMBlock to request the page in, if NULL it's
> > > the same
> > > + * as the last request (a name must have been given previously)
> > > + * Start: Address offset within the RB
> > > + * Len: Length in bytes required - must be a multiple of pagesize
> > > + */
> > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char
> > > *rbname,
> > > + ram_addr_t start, ram_addr_t len)
> > > +{
> > > + uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto
> > > 256 */
> > > + uint64_t *buf64 = (uint64_t *)bufc;
> > > + size_t msglen = 16; /* start + len */
> > > +
> > > + assert(!(len & 1));
> > > + if (rbname) {
> > > + int rbname_len = strlen(rbname);
> > > + assert(rbname_len < 256);
> > > +
> > > + len |= 1; /* Flag to say we've got a name */
> > > + bufc[msglen++] = rbname_len;
> > > + memcpy(bufc + msglen, rbname, rbname_len);
> > > + msglen += rbname_len;
> > > + }
> > > +
> > > + buf64[0] = cpu_to_be64((uint64_t)start);
> > > + buf64[1] = cpu_to_be64((uint64_t)len);
> > > + migrate_send_rp_message(mis, MIG_RP_CMD_REQ_PAGES, msglen, bufc);
> >
> > So.. what's the reason we actually need ramblock names on the wire,
> > rather than working purely from GPAs?
> >
> > It occurs to me that referencing ramblock names from the wire protocol
> > exposes something that's kind of an internal detail, and may limit our
> > options for reworking the memory subsystem in future.
>
> RAMBlock names are already exposed on the wire in precopy migration in
> the forward direction anyway (see save_page_header), however there
> are a few reasons:
> 1) There's no guarantee that the page you are transmitting is currently
> mapped into the guest. The ACPI tables are never mapped.
> 2) Aliases in GPA are allowed.
>
> The only thing that's unique is a reference to the RAMBlock and an offset
> within it.
> (and yes, it does break when we change RAMBlock names but that's normally
> accidental - it did happen sometime around QEMU 1.6 when the PCI strings
> were accidentally changed with a knock on effect of renaming RAMBlocks).
Ah, right, good point.
> > > +}
> > > +
> > > void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > {
> > > const char *p;
> > > @@ -789,6 +819,17 @@ static void source_return_path_bad(MigrationState *s)
> > > }
> > >
> > > /*
> > > + * Process a request for pages received on the return path,
> > > + * We're allowed to send more than requested (e.g. to round to our page
> > > size)
> > > + * and we don't need to send pages that have already been sent.
> > > + */
> > > +static void migrate_handle_rp_req_pages(MigrationState *ms, const char*
> > > rbname,
> > > + ram_addr_t start, ram_addr_t len)
> > > +{
> > > + trace_migrate_handle_rp_req_pages(start, len);
> > > +}
> > > +
> > > +/*
> > > * Handles messages sent on the return path towards the source VM
> > > *
> > > */
> > > @@ -800,6 +841,8 @@ static void *source_return_path_thread(void *opaque)
> > > const int max_len = 512;
> > > uint8_t buf[max_len];
> > > uint32_t tmp32;
> > > + ram_addr_t start, len;
> > > + char *tmpstr;
> > > int res;
> > >
> > > trace_source_return_path_thread_entry();
> > > @@ -815,6 +858,11 @@ static void *source_return_path_thread(void *opaque)
> > > expected_len = 4;
> > > break;
> > >
> > > + case MIG_RP_CMD_REQ_PAGES:
> > > + /* 16 byte start/len _possibly_ plus an id str */
> > > + expected_len = 16 + 256;
> >
> > Isn't that the maximum length, rather than the minimum or typical length?
>
> I'm just trying to be fairly paranoid prior to reading the header.
> If at this point we're working our way through garbage received off a
> misaligned
> stream then if we fail to spot it in this switch then we end up doing
> a qemu_get_buffer on that bad length. Checking it just for a maximum
> would make it safe against something malicious, but if the destination
> hadn't sent that much data then we'd just block here and never get around
> to reporting an error. This way, for most command types we're
> being pretty careful.
>
>
> > > + break;
> > > +
> > > default:
> > > error_report("RP: Received invalid cmd 0x%04x length 0x%04x",
> > > header_com, header_len);
> > > @@ -860,6 +908,28 @@ static void *source_return_path_thread(void *opaque)
> > > trace_source_return_path_thread_pong(tmp32);
> > > break;
> > >
> > > + case MIG_RP_CMD_REQ_PAGES:
> > > + start = be64_to_cpup((uint64_t *)buf);
> > > + len = be64_to_cpup(((uint64_t *)buf)+1);
> > > + tmpstr = NULL;
> > > + if (len & 1) {
> > > + len -= 1; /* Remove the flag */
> > > + /* Now we expect an idstr */
> > > + tmp32 = buf[16]; /* Length of the following idstr */
> > > + tmpstr = (char *)&buf[17];
> > > + buf[17+tmp32] = '\0';
> > > + expected_len = 16+1+tmp32;
> > > + } else {
> > > + expected_len = 16;
> >
> > Ah.. so expected_len is changed here. But then what was the point of
> > setting it in the earlier switch?
>
> For the upper bound on the qemu_get_buffer.
Ok, I think I'd forgotten exactly how the checks worked from the
earlier patch.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
pgpEFtyKpZpd3.pgp
Description: PGP signature