From: Juan Quintela
Subject: Re: [PATCH v2 2/2] migration: add support for qatzip compression when doing live migration
Date: Tue, 18 Apr 2023 09:49:32 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

"you.chen" <you.chen@intel.com> wrote:
> Add config and logic to use qatzip for page compression. In order to
> support qatzip compression better, we collect multiple pages together
> and compress them with qatzip for best performance.
> The compile option CONFIG_QATZIP determines whether the qatzip-related
> code is compiled or not.
>
> Co-developed-by: dennis.wu <dennis.wu@intel.com>
> Signed-off-by: you.chen <you.chen@intel.com>

two questions:
a - why are you using this?  I guess it is because it is faster, but you
    are not saying so.
b - why are you using it with compression threads instead of multifd
    compression?  It could probably just be another multifd method.


[...]

>      MigrationState *s;
> @@ -4451,6 +4470,8 @@ static Property migration_properties[] = {
>      DEFINE_PROP_UINT8("x-compress-threads", MigrationState,
>                        parameters.compress_threads,
>                        DEFAULT_MIGRATE_COMPRESS_THREAD_COUNT),
> +    DEFINE_PROP_BOOL("x-compress-with-qat", MigrationState,
> +                      parameters.compress_with_qat, false),
>      DEFINE_PROP_BOOL("x-compress-wait-thread", MigrationState,
>                        parameters.compress_wait_thread, true),
>      DEFINE_PROP_UINT8("x-decompress-threads", MigrationState,
> @@ -4580,6 +4601,7 @@ static void migration_instance_init(Object *obj)
>      params->has_compress_level = true;
>      params->has_compress_threads = true;
>      params->has_compress_wait_thread = true;
> +    params->has_compress_with_qat = true;
>      params->has_decompress_threads = true;
>      params->has_throttle_trigger_threshold = true;
>      params->has_cpu_throttle_initial = true;

If you use it as a parameter, these bits are ok, but I still think that
it would work better as a multifd compression method.
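
For reference, registering it as a multifd compression method is roughly
this much glue, modelled on migration/multifd-zlib.c (the
MULTIFD_COMPRESSION_QATZIP enum value and the qatzip_* callbacks are
hypothetical and would have to be added; MultiFDMethods and
multifd_register_ops() are what the existing zlib/zstd methods use):

/*
 * Minimal sketch, not a working implementation.
 */
static MultiFDMethods multifd_qatzip_ops = {
    .send_setup = qatzip_send_setup,
    .send_cleanup = qatzip_send_cleanup,
    .send_prepare = qatzip_send_prepare,    /* compress p->pages here */
    .recv_setup = qatzip_recv_setup,
    .recv_cleanup = qatzip_recv_cleanup,
    .recv_pages = qatzip_recv_pages,        /* decompress into guest RAM */
};

static void multifd_qatzip_register(void)
{
    multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, &multifd_qatzip_ops);
}

migration_init(multifd_qatzip_register);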

> -#define IO_BUF_SIZE 32768
> +#ifdef CONFIG_QATZIP
> +#define IO_BUF_SIZE (512 * KiB)
> +#else
> +#define IO_BUF_SIZE (32 * KiB)
> +#endif

1st part that is already weird O:-)
I mean, we are not supposed to have buffers this big.

> +uint8_t *qemu_get_pos(QEMUFile *f)
> +{
> +    return f->buf + f->buf_index;
> +}
> +

I know I shouldn't ask, but why do you need this?
/me looks later in the patch.

> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -117,6 +117,20 @@ static void __attribute__((constructor)) init_cpu_flag(void)
>  
>  XBZRLECacheStats xbzrle_counters;
>  
> +/* define the max page number to compress together */
> +#define MULTI_PAGE_NUM 64

You get 64 pages to compress just by using multifd; nothing else is needed.

> +#define COMP_BUF_SIZE (TARGET_PAGE_SIZE *  MULTI_PAGE_NUM * 2)
> +#define DECOMP_BUF_SIZE (TARGET_PAGE_SIZE *  MULTI_PAGE_NUM)
> +
> +typedef struct MultiPageAddr {
> +    /* real pages that will compress together */
> +    uint64_t pages;
> +    /* the last index of the addr*/
> +    uint64_t last_idx;
> +    /* each address might contain contineous pages*/
> +    uint64_t addr[MULTI_PAGE_NUM];
> +} MultiPageAddr;
> +
>  /* used by the search for pages to send */
>  struct PageSearchStatus {
>      /* The migration channel used for a specific host page */
> @@ -127,6 +141,12 @@ struct PageSearchStatus {
>      RAMBlock    *block;
>      /* Current page to search from */
>      unsigned long page;
> +    /*
> +     * multi page search from current page
> +     * for compress together with qatzip
> +     * stream APIs
> +     */
> +    MultiPageAddr mpa;
>      /* Set once we wrap around */
>      bool         complete_round;
>      /* Whether we're sending a host page */
> @@ -506,6 +526,15 @@ struct CompressParam {
>      /* internally used fields */
>      z_stream stream;
>      uint8_t *originbuf;
> +
> +#ifdef CONFIG_QATZIP
> +    /*multi page address for compression*/
> +    MultiPageAddr mpa;
> +    QzSession_T qzsess;
> +    uint8_t *decompbuf;
> +    uint8_t *compbuf;
> +    /* QzStream_T qzstream; */
> +#endif
>  };
>  typedef struct CompressParam CompressParam;
>  
> @@ -518,6 +547,15 @@ struct DecompressParam {
>      uint8_t *compbuf;
>      int len;
>      z_stream stream;
> +    RAMBlock *block;
> +
> +#ifdef CONFIG_QATZIP

Aside: I would expect configure to detect whether QATZIP is available.

> +    /* decompress multi pages with qzlib*/
> +    QzSession_T qzsess;
> +    /* QzStream_T qzstream; */
> +    uint8_t *decompbuf; /* buffer after decompress */

The comment is wrong.  You use this variable in do_compress_ram_page(),
which has nothing to do with decompression.

> +    MultiPageAddr mpa;   /* destination */
> +#endif
>  };

I am not going to keep looking after this.  We already have a mechanism
to look for several pages.

See multifd_queue_page().

We have two options here:
- multifd_queue_page() is better: you need to generalize it and write
  this on top of it.
- yours is better: then you need to write the multifd one on top of it.

What I don't want is two different mechanisms to collect the next 64
dirty pages.  I would even consider it good if you _always_ searched for
multiple pages, say 64 (the number we use now in both multifd and your
approach), and just did a for loop in the normal precopy case.
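
Something along these lines (purely illustrative: collect_dirty_pages(),
batching_backend_enabled(), send_page_batch() and save_one_page() are
hypothetical helpers, not existing functions, and each mpa.addr entry is
treated as a single page for simplicity):

static int ram_save_page_batch(RAMState *rs, PageSearchStatus *pss)
{
    MultiPageAddr mpa;
    int pages = 0;

    /* one common place that finds up to MULTI_PAGE_NUM dirty pages */
    collect_dirty_pages(rs, pss, &mpa, MULTI_PAGE_NUM);

    if (batching_backend_enabled()) {
        /* multifd or qatzip: hand over the whole batch */
        return send_page_batch(rs, pss->block, &mpa);
    }

    /* normal precopy: just loop over the same batch one page at a time */
    for (uint64_t i = 0; i < mpa.last_idx; i++) {
        pages += save_one_page(rs, pss->block, mpa.addr[i]);
    }
    return pages;
}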

Another note: I am not a big fan of:

#ifdef CONFIG_QATZIP
            if (migrate_compress_with_qat()) {
                do_compress_ram_page_multiple(param->file, &param->qzsess,
                    param->decompbuf, param->compbuf, block, &param->mpa);
            } else {
#endif
                zero_page = do_compress_ram_page(param->file,
                    &param->stream, block, offset, param->originbuf);
                param->zero_page = zero_page;
#ifdef CONFIG_QATZIP
            }
#endif

Two ifdefs and an if mixed, ouch.

Suggestions:
- Just create a function that calls one or the other (see the sketch
  right below).
- Or add another function pointer to migration_ops, and set the proper
  function there during initialization.
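
The first one could look roughly like this (just a sketch reusing the
names from your hunk; I am assuming the qatzip path never reports a zero
page, since you don't use its return value):

static bool compress_ram_block(CompressParam *param, RAMBlock *block,
                               ram_addr_t offset)
{
#ifdef CONFIG_QATZIP
    if (migrate_compress_with_qat()) {
        do_compress_ram_page_multiple(param->file, &param->qzsess,
                                      param->decompbuf, param->compbuf,
                                      block, &param->mpa);
        return false;   /* the qatzip path does not detect zero pages */
    }
#endif
    return do_compress_ram_page(param->file, &param->stream,
                                block, offset, param->originbuf);
}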

Another nitpick:
- You are dropping the zero page optimization; you are just compressing
  zero pages like any other page (see the buffer_is_zero() sketch
  further below).
- See how it is done in my zero-pages-on-multifd series on the list.
  a - we only send the page address for a zero page (it is hard to beat
      sending zero bytes for a page)
  b - I don't know how QzSession_T works, but I guess it is going to be
      a loose equivalent of how zlib works.  Notice that the compression
      code does:

do_compress_ram_page() -> qemu_put_compression_data() -> ...

static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
                              const uint8_t *source, size_t source_len)
{
    int err;

    err = deflateReset(stream);
    if (err != Z_OK) {
        return -1;
    }

    stream->avail_in = source_len;
    stream->next_in = (uint8_t *)source;
    stream->avail_out = dest_len;
    stream->next_out = dest;

    err = deflate(stream, Z_FINISH);
    if (err != Z_STREAM_END) {
        return -1;
    }

    return stream->next_out - dest;
}

I can't see the equivalent of deflateReset() in your code (remember that
I don't understand qat).  Remember that the number of threads for
compression and decompression can be different, so the destination
thread is not guaranteed to have the same dictionaries as the one you
are using.  That is one of the reasons why multifd-zlib is much, much
better at compression than compression threads:
a - it uses 64 pages at a time
b - as each recv thread is paired with the same send thread, we don't
    need to reset the dictionaries, and that makes things much, much
    better.  I thought it was a bug the 1st time I saw it compress 64
    pages to less than 64 bytes O:-)
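
For the zero page point, a rough sketch of keeping that optimization in
a batched path (not your code; it treats each mpa->addr entry as a
single page offset for simplicity, and buffer_is_zero() is the helper
QEMU already uses for this check):

#include "qemu/cutils.h"    /* buffer_is_zero() */

static size_t gather_nonzero_pages(RAMBlock *block, const MultiPageAddr *mpa,
                                   uint8_t *dst)
{
    size_t filled = 0;

    for (uint64_t i = 0; i < mpa->last_idx; i++) {
        uint8_t *page = block->host + mpa->addr[i];

        if (buffer_is_zero(page, TARGET_PAGE_SIZE)) {
            /* zero page: only its address needs to go on the wire */
            continue;
        }
        memcpy(dst + filled, page, TARGET_PAGE_SIZE);
        filled += TARGET_PAGE_SIZE;
    }
    return filled;
}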

Later, Juan.



