qemu-riscv

Re: [PATCH] target/riscv: reduce overhead of MSTATUS_SUM change


From: LIU Zhiwei
Subject: Re: [PATCH] target/riscv: reduce overhead of MSTATUS_SUM change
Date: Wed, 22 Mar 2023 11:16:48 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0


On 2023/3/22 10:47, Wu, Fei wrote:
On 3/22/2023 9:58 AM, LIU Zhiwei wrote:
On 2023/3/22 0:10, Richard Henderson wrote:
On 3/20/23 23:37, fei2.wu@intel.com wrote:
From: Fei Wu <fei2.wu@intel.com>

The kernel needs to access user-mode memory, e.g. during syscalls. The window
is usually opened for only a very short time through MSTATUS.SUM, so the
overhead is too high if tlb_flush() gets called for every SUM change.
This patch saves the addresses accessed while SUM=1 and flushes only those
pages when SUM changes to 0. If the buffer is not large enough to hold
all the pages accessed while SUM=1, it falls back to tlb_flush() when
necessary.

The buffer size is set to 4 because, within this MSTATUS.SUM window,
the kernel usually accesses 1 or 2 pages; it is very rare to see
more than 4 pages accessed.
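
The bookkeeping described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the patch: the names SumWindow, sum_window_record, sum_window_close and FlushKind are hypothetical, a 4 KiB page size is assumed, and the caller is expected to perform the actual page or full TLB flush based on the returned decision.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SUM_BUF_SIZE  4                     /* buffer size quoted in the description */
#define TOY_PAGE_MASK (~(uint64_t)0xfff)    /* assumption: 4 KiB pages */

/* Hypothetical bookkeeping; not the fields used by the actual patch. */
typedef struct {
    uint64_t pages[SUM_BUF_SIZE];
    size_t   count;
    bool     overflow;   /* more distinct pages were touched than fit */
} SumWindow;

typedef enum { FLUSH_NONE, FLUSH_PAGES, FLUSH_ALL } FlushKind;

/* Record the page of an address accessed while SUM=1. */
static void sum_window_record(SumWindow *w, uint64_t addr)
{
    uint64_t page = addr & TOY_PAGE_MASK;

    if (w->overflow) {
        return;                      /* already committed to a full flush */
    }
    for (size_t i = 0; i < w->count; i++) {
        if (w->pages[i] == page) {
            return;                  /* page already recorded */
        }
    }
    if (w->count == SUM_BUF_SIZE) {
        w->overflow = true;          /* buffer full: fall back to full flush */
        return;
    }
    w->pages[w->count++] = page;
}

/*
 * Called when SUM changes from 1 to 0. Copies the recorded pages to 'out',
 * stores their number in '*n', resets the window, and reports what the
 * caller should flush.
 */
static FlushKind sum_window_close(SumWindow *w, uint64_t *out, size_t *n)
{
    FlushKind kind;

    *n = 0;
    if (w->overflow) {
        kind = FLUSH_ALL;
    } else if (w->count == 0) {
        kind = FLUSH_NONE;
    } else {
        kind = FLUSH_PAGES;
        for (size_t i = 0; i < w->count; i++) {
            out[i] = w->pages[i];
        }
        *n = w->count;
    }
    w->count = 0;
    w->overflow = false;
    return kind;
}
```

In this sketch, two accesses to the same page occupy a single slot, and a fifth distinct page inside one window turns the next close into a full flush.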

It's not necessary to save/restore this newly added state, as
tlb_flush() is always called after restore.

The 'pipe 10' result from unixbench improves from 223656 to 1327407. Many
other syscalls benefit a lot from this change too.

This is not the correct approach.

You should be making use of different softmmu indexes, similar to how
ARM uses a separate index for PAN (privileged access never) mode.  If
I read the manual properly, PAN == !SUM.

When you do this, you need no additional flushing.
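
A toy model of the separate-index idea, to make the "no additional flushing" point concrete. This is a sketch under stated assumptions, not QEMU code: the index names (MMU_U, MMU_S, MMU_S_SUM, MMU_M), the ToyTlb layout and the toy_* helpers are all hypothetical, M-mode subtleties such as MPRV are ignored, and a direct-mapped cache with 4 KiB pages stands in for the real softmmu TLB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mmu indexes: one extra index for S-mode with SUM=1. */
enum { MMU_U, MMU_S, MMU_S_SUM, MMU_M, MMU_NUM };

/* RISC-V privilege level encodings. */
enum { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

#define TOY_TLB_ENTRIES 8
#define TOY_PAGE_SHIFT  12   /* assumption: 4 KiB pages */

typedef struct {
    bool     valid;
    uint64_t vpage;
} ToyEntry;

/* Each mmu index owns a separate cache of translations. */
typedef struct {
    ToyEntry e[MMU_NUM][TOY_TLB_ENTRIES];
} ToyTlb;

/* Pick the single index used for every access in the current state. */
static int toy_mmu_index(int priv, bool sum)
{
    switch (priv) {
    case PRIV_U:
        return MMU_U;
    case PRIV_S:
        return sum ? MMU_S_SUM : MMU_S;
    default:
        return MMU_M;
    }
}

/* Look up an address in exactly one index; other indexes are not consulted. */
static bool toy_tlb_lookup(const ToyTlb *tlb, int idx, uint64_t addr)
{
    uint64_t vpage = addr >> TOY_PAGE_SHIFT;
    const ToyEntry *ent = &tlb->e[idx][vpage % TOY_TLB_ENTRIES];

    return ent->valid && ent->vpage == vpage;
}

/* Install a translation for an address in exactly one index. */
static void toy_tlb_fill(ToyTlb *tlb, int idx, uint64_t addr)
{
    uint64_t vpage = addr >> TOY_PAGE_SHIFT;
    ToyEntry *ent = &tlb->e[idx][vpage % TOY_TLB_ENTRIES];

    ent->valid = true;
    ent->vpage = vpage;
}
```

In this model, a user page filled while running under the S+SUM index stays in that index only. Once SUM is cleared, accesses go through the plain S index, miss there, and take the normal fill path where the permission check happens, so clearing SUM itself does not require a flush.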
Hi Fei,

Let's follow Richard's advice.
Yes, I'm thinking about how to do it, and thanks to Richard for the advice.
My question is:
* If we ensure this separate index (S+SUM) has no overlapping tlb
entries with S-mode (ignoring M-mode for now), during SUM=1,

Yes, every mmu index will have its own cache.

we have to
look into both the (S+SUM) and S indexes for kernel address translation, which
is not desirable.

No, we have to choose one, because each access is constrained to a single mmu index.

* If all the tlb operations are against (S+SUM) during SUM=1, then
(S+SUM) could contain duplicates of the kernel-address tlb entries in the S
index; the duplication means extra tlb lookups and fills. Also, if we want
to flush the tlb entry of a specific addr0, we have to flush both indexes.

This is not the case.

Zhiwei


I will take a look at how arm handles this.

Thanks,
Fei.

Zhiwei


r~


