Re: [Qemu-devel] [PATCH v2 0/5] trace: [tcg] Optimize per-vCPU tracing s
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 0/5] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Date: Mon, 26 Sep 2016 14:37:55 +0100
User-agent: Mutt/1.7.0 (2016-08-17)
On Thu, Sep 15, 2016 at 05:50:37PM +0200, Lluís Vilanova wrote:
> Avoids generating TCG code to call guest code tracing events in vCPUs that are
> not dynamically tracing that event.
>
> Currently, events with the 'tcg' property always generate TCG code to trace
> that event at guest code execution time, when their dynamic tracing state is
> checked.
>
> This series adds a performance optimization where TCG code for events with the
> 'tcg' and 'vcpu' properties is not generated if the event is dynamically
> disabled. This optimization raises two issues:
>
> * An event can be dynamically disabled/enabled after the corresponding TCG
>   code has been generated (i.e., a new TB with the corresponding code should
>   be used).
>
> * Each vCPU can have a different dynamic state for the same event (e.g.,
>   tracing the memory accesses of only one process pinned to a vCPU).
>
> To handle both issues, this series replicates the shared physical TB cache,
> creating a separate physical TB cache for every combination of event states
> (those with the 'vcpu' and 'tcg' properties). Then, all vCPUs tracing the same
> events will use the same physical TB cache.
>
> Sharing physical TBs makes this very space efficient (only the physical TB
> caches, simple arrays of pointers, are replicated), sharing physical TB caches
> maximizes TB reuse across vCPUs whenever possible, and makes dynamic event
> state changes more efficient (simply use a different TB array).
>
> The physical TB cache array is indexed with the vCPU's trace event state
> bitmask. This is simpler and more efficient than emitting TCG code to check if
> an event needs tracing; in that case we would still have to either move the
> tracing call code to a cold path (making tracing performance worse) or leave
> it inlined (making non-tracing performance worse).
>
> It is also more efficient than eliding TCG code only when *zero* vCPUs are
> tracing an event, since enabling it on a single vCPU will impact the
> performance of all other vCPUs that are not tracing that event.
>
> Signed-off-by: Lluís Vilanova <address@hidden>
> ---
TCG folks?
The design of this patch is more related to TCG than tracing since it
affects TB caching.
Stefan
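
For readers following the thread, the indexing scheme described in the cover letter can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the series or from the QEMU tree: every name (`TB`, `VCPU`, `tb_caches`, `vcpu_set_event`, `tb_lookup_or_gen`), the two-event bitmask, and the tiny direct-mapped cache are hypothetical. For brevity each cache also owns its TBs here, whereas the cover letter says the physical TBs themselves are shared and only the arrays of pointers are replicated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is taken from the series. */
#define NUM_VCPU_TCG_EVENTS 2u                     /* events with 'tcg' and 'vcpu' */
#define NUM_TB_CACHES (1u << NUM_VCPU_TCG_EVENTS)  /* one per state combination */
#define TB_CACHE_SIZE 8u                           /* tiny direct-mapped cache */

typedef struct TB {
    uint64_t phys_pc;     /* guest physical address the block starts at */
    unsigned trace_mask;  /* event states the block was generated for */
} TB;

typedef struct VCPU {
    unsigned trace_mask;  /* bit N set: event N is dynamically enabled */
} VCPU;

/* One array of TB pointers per combination of event states. */
static TB *tb_caches[NUM_TB_CACHES][TB_CACHE_SIZE];

/* Changing a vCPU's dynamic state only changes which array it indexes. */
static void vcpu_set_event(VCPU *cpu, unsigned event, bool enabled)
{
    assert(event < NUM_VCPU_TCG_EVENTS);
    if (enabled) {
        cpu->trace_mask |= 1u << event;
    } else {
        cpu->trace_mask &= ~(1u << event);
    }
}

/* Look up a TB in the cache selected by the vCPU's bitmask, generating a
 * new one on a miss. *generated reports whether translation was needed. */
static TB *tb_lookup_or_gen(VCPU *cpu, uint64_t phys_pc, bool *generated)
{
    assert(cpu->trace_mask < NUM_TB_CACHES);
    TB **slot = &tb_caches[cpu->trace_mask][phys_pc % TB_CACHE_SIZE];

    if (*slot != NULL && (*slot)->phys_pc == phys_pc) {
        *generated = false;
        return *slot;
    }

    TB *tb = malloc(sizeof(*tb));
    if (tb == NULL) {
        abort();
    }
    tb->phys_pc = phys_pc;
    tb->trace_mask = cpu->trace_mask;  /* stands in for "code traces these events" */

    free(*slot);                       /* evict whatever occupied the slot */
    *slot = tb;
    *generated = true;
    return tb;
}
```

In this sketch two vCPUs with the same bitmask hit the same array and so reuse each other's TBs, while a vCPU whose state changes via `vcpu_set_event()` simply starts looking up in a different array, with no run-time check emitted into the generated code.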
- [Qemu-devel] [PATCH v2 0/5] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches, Lluís Vilanova, 2016/09/15
- [Qemu-devel] [PATCH v2 1/5] exec: [tcg] Refactor flush of per-CPU virtual TB cache, Lluís Vilanova, 2016/09/15
- [Qemu-devel] [PATCH v2 2/5] exec: [tcg] Use multiple physical TB caches, Lluís Vilanova, 2016/09/15
- [Qemu-devel] [PATCH v2 3/5] exec: [tcg] Switch physical TB cache based on vCPU tracing state, Lluís Vilanova, 2016/09/15
- [Qemu-devel] [PATCH v2 4/5] trace: [tcg] Do not generate TCG code to trace dinamically-disabled events, Lluís Vilanova, 2016/09/15
- [Qemu-devel] [PATCH v2 5/5] trace: [tcg, trivial] Re-align generated code, Lluís Vilanova, 2016/09/15
- Re: [Qemu-devel] [PATCH v2 0/5] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches, Stefan Hajnoczi, 2016/09/26