Re: [Qemu-discuss] ASan'ed binaries start up very slow under qemu-aarch64.
Mon, 18 Jul 2016 16:51:21 +0100
(CCing qemu-devel, which is more likely to get developer attention)
On 18 July 2016 at 15:45, Maxim Ostapenko <address@hidden> wrote:
> 1) AddressSanitizer mmaps quite large regions of memory for redzones and
> shadow gap. In particular, for a 39-bit address space it mmaps:
> || `[0x1400000000, 0x1fffffffff]` || HighShadow || - 48 Gb
> || `[0x1200000000, 0x13ffffffff]` || ShadowGap || - 8 Gb
> || `[0x1000000000, 0x11ffffffff]` || LowShadow || - 8 Gb
> 2) In QEMU, page_set_flags is called for these ranges. It splits the given
> range into individual pages and sets flags for each of them. Given that the
> page size is 4 Kb, for the 8 Gb range we have 2097152 iterations and for the
> 48 Gb range 12582912 iterations of the inner loop. This is obviously a
> performance bottleneck.
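To make the numbers above easy to check, here is a small sketch that recomputes the region sizes and per-page iteration counts from the quoted addresses. It is only arithmetic on the figures in the message; the helper names (region_size, page_count) are made up for illustration and are not QEMU functions.

```python
PAGE_SIZE = 4 * 1024  # 4 KiB guest pages, as stated in the quoted message
GIB = 1 << 30

# Inclusive [start, end] ranges copied from the quoted ASan 39-bit layout.
REGIONS = {
    "HighShadow": (0x1400000000, 0x1FFFFFFFFF),
    "ShadowGap": (0x1200000000, 0x13FFFFFFFF),
    "LowShadow": (0x1000000000, 0x11FFFFFFFF),
}


def region_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def page_count(start, end, page_size=PAGE_SIZE):
    """Number of pages a naive per-page loop would visit for the range."""
    return region_size(start, end) // page_size


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        size_gib = region_size(start, end) // GIB
        print(f"{name}: {size_gib} GiB, {page_count(start, end)} pages")
```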
Mmm, the algorithm here is pretty simple and basically assumes the
guest isn't going to be doing enormous allocations like that.
(If the host process doesn't happen to have a suitable big lump of its
VA space free then the mmap will fail anyway.)
> 3) The same issue may happen later in the page_check_range function, when
> ASan tries to read /proc/self/maps after it has already mmaped the
> HighShadow, ShadowGap and LowShadow regions.
> Could someone help me mitigate this performance issue? Do we really need
> to set flags on each page of the entire (quite big) memory region?
Well, we do need to do some things:
* we're populating the PageDesc data structure which we later use
to cache generated code
* if we're marking the range as writeable and it wasn't previously
writeable, we need to check whether there's already generated code
anywhere in this memory range and invalidate those translations
This could probably be done in a way that doesn't iterate naively
through every page, though.
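As a rough illustration of what "not iterating through every page" could look like, here is a minimal sketch that records flags per address range instead of per page, so a 48 Gb mapping costs one entry rather than millions of loop iterations. This is not how QEMU implements page_set_flags; the RangeFlags class and the PROT_READ/PROT_WRITE values are hypothetical, and the sketch deliberately ignores the PageDesc bookkeeping and the invalidation of already-translated code described above.

```python
import bisect

# Hypothetical flag bits, chosen only for this example.
PROT_READ = 1
PROT_WRITE = 2


class RangeFlags:
    """Flags stored as sorted, non-overlapping [start, end) intervals."""

    def __init__(self):
        self._ivals = []   # list of (start, end, flags), end exclusive
        self._starts = []  # interval start addresses, kept in step for bisect

    def set_flags(self, start, end, flags):
        """Set flags for [start, end), replacing any overlapping ranges."""
        kept = []
        for s, e, f in self._ivals:
            if e <= start or s >= end:
                kept.append((s, e, f))  # no overlap: keep unchanged
                continue
            if s < start:
                kept.append((s, start, f))  # keep the part before the new range
            if e > end:
                kept.append((end, e, f))  # keep the part after the new range
        kept.append((start, end, flags))
        kept.sort()
        self._ivals = kept
        self._starts = [s for s, _, _ in kept]

    def get_flags(self, addr):
        """Return the flags covering addr, or 0 if addr is unmapped."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            _, e, f = self._ivals[i]
            if addr < e:
                return f
        return 0

    def interval_count(self):
        """Number of stored ranges (the cost is per range, not per page)."""
        return len(self._ivals)
```

With this layout, marking the whole HighShadow region is a single set_flags call that stores one interval, and changing the flags of one page inside it splits that interval into three. The rebuild in set_flags is linear in the number of stored ranges, which is fine for a sketch; a balanced interval tree would be the usual choice if the number of ranges grew large.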