Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
From: Li, Liang Z
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Date: Thu, 12 Nov 2015 09:40:18 +0000
> >>> I am very surprised about the live migration performance result
> >>> when I use your 'memeqzero4_paolo' instead of these SSE2 intrinsics
> >>> to check the zero pages.
> >>
> >> What code were you using? Remember I suggested using only unsigned
> >> long checks, like
> >>
> >> unsigned long *p = ...
> >> if (p[0] || p[1] || p[2] || p[3]
> >> || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
> >> return BUFFER_NOT_ZERO;
> >> else
> >> return BUFFER_ZERO;
> >>
> >
> > I use the following code:
> >
> >
> > bool memeqzero4_paolo(const void *data, size_t length) {
> > ...
> > }
>
> The code you used is very generic and not optimized for the kind of data you
> see during migration, hence the existing code in QEMU fares better.
>
I migrated an idle guest with 8GB of RAM; I think most of its pages are zero pages.
I used your new code:
-------------------------------------------------
unsigned long *p = ...
if (p[0] || p[1] || p[2] || p[3]
|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
return BUFFER_NOT_ZERO;
else
return BUFFER_ZERO;
---------------------------------------------------
and the result is almost the same. I also tried checking the first 8 and the
first 16 unsigned longs at the beginning instead of 4, with the same result.
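
For reference, the check quoted above can be written as a self-contained function. This is only a minimal sketch: the name page_is_zero, the bool return type, and the byte-loop fallback for short or odd-sized buffers are assumptions made for illustration and are not taken from QEMU. It also assumes the buffer is aligned for unsigned long, which holds for guest pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not QEMU's actual function.
 * Returns true if all `size` bytes of `buf` are zero.
 * Assumes `buf` is suitably aligned for unsigned long. */
static bool page_is_zero(const void *buf, size_t size)
{
    const unsigned long *p = buf;
    const size_t head = 4 * sizeof(unsigned long);

    /* Assumed fallback: buffers too small or not a multiple of the
     * word size are checked byte by byte. */
    if (size < head || size % sizeof(unsigned long) != 0) {
        const unsigned char *b = buf;
        for (size_t i = 0; i < size; i++) {
            if (b[i]) {
                return false;
            }
        }
        return true;
    }

    /* Initial check: most non-zero pages are rejected here. */
    if (p[0] || p[1] || p[2] || p[3]) {
        return false;
    }

    /* The first four words are zero. Comparing the buffer with itself
     * shifted by four words shows that every later word equals the one
     * four words before it, so all words are zero. */
    return memcmp(p + 4, p, size - head) == 0;
}
```

The overlapping memcmp is safe because memcmp only reads its arguments.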
> >>> The total live migration time increased about
> >>> 8%! Not decreased. Although in the unit test your
> >>> 'memeqzero4_paolo' has better performance, any idea?
> >>
> >> You only tested the case of zero pages. But real pages usually are
> >> not zero, even if they have a few zero bytes at the beginning. It's
> >> very important to optimize the initial check before the memcmp call.
> >>
> >
> > In the unit test, I only test zero pages too, and the performance of
> > 'memeqzero4_paolo' is better.
> > But when merged into QEMU, it caused performance drop. Why?
>
> Because QEMU is not migrating zero pages only.
>
> Paolo
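
One way to see why the page mix matters is to record where each page is decided. The sketch below is illustrative only: the enum, its constants, and the classify_page name are hypothetical, and it assumes size is a multiple of sizeof(unsigned long) and at least four words long. Counting the three outcomes over the pages of a real migration would show how often the cheap initial check decides the result and how often the full memcmp scan is paid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for where the zero check is decided. */
enum zero_check_result {
    NONZERO_IN_HEAD, /* rejected by the four-word check; memcmp never runs */
    NONZERO_IN_TAIL, /* head was zero; memcmp found a non-zero word later */
    ALL_ZERO         /* memcmp had to scan the whole page */
};

/* Assumes size >= 4 * sizeof(unsigned long) and size is a multiple of
 * sizeof(unsigned long). */
static enum zero_check_result classify_page(const unsigned long *p, size_t size)
{
    if (p[0] || p[1] || p[2] || p[3]) {
        return NONZERO_IN_HEAD;
    }
    if (memcmp(p + 4, p, size - 4 * sizeof(unsigned long)) != 0) {
        return NONZERO_IN_TAIL;
    }
    return ALL_ZERO;
}
```

A unit test that feeds only zero pages exercises the ALL_ZERO path alone, so it says nothing about the cost of the other two paths.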