[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v1 RFC Zisslpcfi 6/9] target/riscv: MMU changes for back cfi'
From: |
Deepak Gupta |
Subject: |
Re: [PATCH v1 RFC Zisslpcfi 6/9] target/riscv: MMU changes for back cfi's shadow stack |
Date: |
Wed, 15 Feb 2023 21:43:20 -0800 |
On Wed, Feb 15, 2023 at 6:36 PM LIU Zhiwei <zhiwei_liu@linux.alibaba.com> wrote:
>
>
> On 2023/2/16 7:57, Deepak Gupta wrote:
>
> `On Wed, Feb 15, 2023 at 12:43 AM LIU Zhiwei
> <zhiwei_liu@linux.alibaba.com> wrote:
>
> On 2023/2/9 14:24, Deepak Gupta wrote:
>
> zisslpcfi protects returns(back cfi) using shadow stack. If compiled with
> enabled compiler, function prologs will have `sspush ra` instruction to
> push return address on shadow stack and function epilogs will have
> `sspop t0; sschckra` instruction sequences. `sspop t0` will pop the
> value from top of the shadow stack in t0. `sschckra` will compare `t0`
> and `x1` and if they don't match then hart will raise an illegal
> instruction exception.
>
> Shadow stack is read-only memory except stores can be performed via
> `sspush` and `ssamoswap` instructions. This requires new PTE encoding for
> shadow stack. zisslpcfi uses R=0, W=1, X=0 (an existing reserved encoding
> ) to encode a shadow stack. If backward cfi is not enabled for current
> mode, shadow stack PTE encodings remain reserved. Regular stores to
> shadow stack raise AMO/store access fault. Shadow stack loads/stores on
> regular memory raise load access/store access fault.
>
> This patch creates a new MMU TLB index for shadow stack and flushes TLB
> for shadow stack on privileges changes. This patch doesn't implement
> `Smepmp` related enforcement on shadow stack pmp entry. Reason being qemu
> doesn't have `Smepmp` implementation yet.
>
> I don't know that the Smepmp means here. QEMU has supported the epmp.
>
> https://github.com/riscv/riscv-tee/blob/main/Smepmp/Smepmp.pdf
>
> This specification has been supported. You can enable this extension by -cpu
> rv64,x-epmp=on.
>
> I didn't see the special contents for shadow stack, neither on cfi
> specfication nor the epmp specification.
>
> You should make it clear that how the shadow stack influenced by the pmp
> rules.
Aah cool. I didn't notice that. I was looking for the string `smepmp`.
WIthout paging, shadow stack store/load simply honor pmpcfg read/write
permissions.
`sspush/ssamoswap` will succeed if the selected pmp entry's pmpcfg
allows a write. `sspop/ssamoswap` will
succeed if pmpcfg allows a read.
For M-mode shadow stack, there is a 6 bit field `sspmp` in mseccfg which selects
pmp entry for shadow stack accesses in M mode
M mode protection of shadow stack:
When mseccfg.MML=1, then
- sspush/sspop/ssamoswap must match PMP entry pointed to by `sspmp`
- stores other than `sspush` and `ssamoswap` that match pmp entry
`sspmp` must can access violation
I've to implement mseccfg.MML=1 rule for shadow stack in M mode.
Note: W=1,R=0,X=0 has already been claimed by Smepmp. So we can't use
that for the shadow stack in pmpcfg.
>
> `Smepmp` enforcement should come
> whenever it is implemented.
>
> Signed-off-by: Deepak Gupta <debug@rivosinc.com>
> Signed-off-by: Kip Walker <kip@rivosinc.com>
> ---
> target/riscv/cpu-param.h | 1 +
> target/riscv/cpu.c | 2 +
> target/riscv/cpu.h | 3 ++
> target/riscv/cpu_helper.c | 107 +++++++++++++++++++++++++++++++-------
> 4 files changed, 94 insertions(+), 19 deletions(-)
>
> diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
> index ebaf26d26d..a1e379beb7 100644
> --- a/target/riscv/cpu-param.h
> +++ b/target/riscv/cpu-param.h
> @@ -25,6 +25,7 @@
> * - M mode 0b011
> * - U mode HLV/HLVX/HSV 0b100
> * - S mode HLV/HLVX/HSV 0b101
> + * - BCFI shadow stack 0b110
> * - M mode HLV/HLVX/HSV 0b111
> */
> #define NB_MMU_MODES 8
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 6b4e90eb91..14cfb93288 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -584,6 +584,8 @@ static void riscv_cpu_reset_hold(Object *obj)
> }
> /* mmte is supposed to have pm.current hardwired to 1 */
> env->mmte |= (PM_EXT_INITIAL | MMTE_M_PM_CURRENT);
> + /* Initialize ss_priv to current priv. */
> + env->ss_priv = env->priv;
> #endif
> env->xl = riscv_cpu_mxl(env);
> riscv_cpu_update_mask(env);
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index d14ea4f91d..8803ea6426 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -379,6 +379,7 @@ struct CPUArchState {
> uint64_t sstateen[SMSTATEEN_MAX_COUNT];
> target_ulong senvcfg;
> uint64_t henvcfg;
> + target_ulong ss_priv;
> #endif
> target_ulong cur_pmmask;
> target_ulong cur_pmbase;
> @@ -617,6 +618,8 @@ void riscv_cpu_set_fflags(CPURISCVState *env,
> target_ulong);
> #define TB_FLAGS_PRIV_HYP_ACCESS_MASK (1 << 2)
> #define TB_FLAGS_MSTATUS_FS MSTATUS_FS
> #define TB_FLAGS_MSTATUS_VS MSTATUS_VS
> +/* TLB MMU index for shadow stack accesses */
> +#define MMU_IDX_SS_ACCESS 6
>
> #include "exec/cpu-all.h"
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index fc188683c9..63377abc2f 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -657,7 +657,8 @@ void riscv_cpu_set_virt_enabled(CPURISCVState *env, bool
> enable)
>
> bool riscv_cpu_two_stage_lookup(int mmu_idx)
> {
> - return mmu_idx & TB_FLAGS_PRIV_HYP_ACCESS_MASK;
> + return (mmu_idx & TB_FLAGS_PRIV_HYP_ACCESS_MASK) &&
> + (mmu_idx != MMU_IDX_SS_ACCESS);
> }
>
> int riscv_cpu_claim_interrupts(RISCVCPU *cpu, uint64_t interrupts)
> @@ -745,6 +746,38 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong
> newpriv)
> * preemptive context switch. As a result, do both.
> */
> env->load_res = -1;
> +
> + if (cpu_get_bcfien(env) && (env->priv != env->ss_priv)) {
> + /*
> + * If backward CFI is enabled in the new privilege state, the
> + * shadow stack TLB needs to be flushed - unless the most recent
> + * use of the SS TLB was for the same privilege mode.
> + */
> + tlb_flush_by_mmuidx(env_cpu(env), 1 << MMU_IDX_SS_ACCESS);
> + /*
> + * Ignoring env->virt here since currently every time it flips,
> + * all TLBs are flushed anyway.
> + */
> + env->ss_priv = env->priv;
> + }
> +}
> +
> +typedef enum {
> + SSTACK_NO, /* Access is not for a shadow stack instruction */
> + SSTACK_YES, /* Access is for a shadow stack instruction */
> + SSTACK_DC /* Don't care about SS attribute in PMP */
> +} SStackPmpMode;
> +
> +static bool legal_sstack_access(int access_type, bool sstack_inst,
> + bool sstack_attribute)
> +{
> + /*
> + * Read/write/execution permissions are checked as usual. Shadow
> + * stack enforcement is just that (1) instruction type must match
> + * the attribute unless (2) a non-SS load to an SS region.
> + */
> + return (sstack_inst == sstack_attribute) ||
> + ((access_type == MMU_DATA_LOAD) && sstack_attribute);
> }
>
> /*
> @@ -764,7 +797,7 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong
> newpriv)
> static int get_physical_address_pmp(CPURISCVState *env, int *prot,
> target_ulong *tlb_size, hwaddr addr,
> int size, MMUAccessType access_type,
> - int mode)
> + int mode, SStackPmpMode sstack)
>
> Why this parameter if you don't use it?
>
> {
> pmp_priv_t pmp_priv;
> int pmp_index = -1;
> @@ -812,13 +845,16 @@ static int get_physical_address_pmp(CPURISCVState *env,
> int *prot,
> * Second stage is used for hypervisor guest translation
> * @two_stage: Are we going to perform two stage translation
> * @is_debug: Is this access from a debugger or the monitor?
> + * @sstack: Is this access for a shadow stack? Passed by reference so
> + it can be forced to SSTACK_DC when the SS check is completed
> + based on a PTE - so the PMP SS attribute will be ignored.
> */
> static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> int *prot, target_ulong addr,
> target_ulong *fault_pte_addr,
> int access_type, int mmu_idx,
> bool first_stage, bool two_stage,
> - bool is_debug)
> + bool is_debug, SStackPmpMode *sstack)
> {
> /* NOTE: the env->pc value visible here will not be
> * correct, but the value visible to the exception handler
> @@ -830,6 +866,7 @@ static int get_physical_address(CPURISCVState *env,
> hwaddr *physical,
> hwaddr ppn;
> RISCVCPU *cpu = env_archcpu(env);
> int napot_bits = 0;
> + bool is_sstack = (sstack != NULL) && (*sstack == SSTACK_YES);
> target_ulong napot_mask;
>
> /*
> @@ -851,6 +888,8 @@ static int get_physical_address(CPURISCVState *env,
> hwaddr *physical,
> if (get_field(env->mstatus, MSTATUS_MPRV)) {
> mode = get_field(env->mstatus, MSTATUS_MPP);
> }
> + } else if (mmu_idx == MMU_IDX_SS_ACCESS) {
> + mode = env->priv;
> }
>
> if (first_stage == false) {
> @@ -966,7 +1005,7 @@ restart:
> int vbase_ret = get_physical_address(env, &vbase, &vbase_prot,
> base, NULL, MMU_DATA_LOAD,
> mmu_idx, false, true,
> - is_debug);
> + is_debug, NULL);
>
> if (vbase_ret != TRANSLATE_SUCCESS) {
> if (fault_pte_addr) {
> @@ -983,7 +1022,7 @@ restart:
> int pmp_prot;
> int pmp_ret = get_physical_address_pmp(env, &pmp_prot, NULL,
> pte_addr,
> sizeof(target_ulong),
> - MMU_DATA_LOAD, PRV_S);
> + MMU_DATA_LOAD, PRV_S,
> SSTACK_NO);
> if (pmp_ret != TRANSLATE_SUCCESS) {
> return TRANSLATE_PMP_FAIL;
> }
> @@ -1010,6 +1049,18 @@ restart:
> }
> }
>
> + /*
> + * When backward CFI is enabled, the R=0, W=1, X=0 reserved encoding
> + * is used to mark Shadow Stack (SS) pages. If back CFI enabled,
> allow
> + * normal loads on SS pages, regular stores raise store access fault
> + * and avoid hitting the reserved-encoding case. Only shadow stack
> + * stores are allowed on SS pages. Shadow stack loads and stores on
> + * regular memory (non-SS) raise load and store/AMO access fault.
> + * Second stage translations don't participate in Shadow Stack.
> + */
> + bool sstack_page = (cpu_get_bcfien(env) && first_stage &&
> + ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W));
> +
> if (!(pte & PTE_V)) {
> /* Invalid PTE */
> return TRANSLATE_FAIL;
> @@ -1021,7 +1072,7 @@ restart:
> return TRANSLATE_FAIL;
> }
> base = ppn << PGSHIFT;
> - } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
> + } else if (((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) &&
> !sstack_page) {
> /* Reserved leaf PTE flags: PTE_W */
> return TRANSLATE_FAIL;
> } else if ((pte & (PTE_R | PTE_W | PTE_X)) == (PTE_W | PTE_X)) {
> @@ -1038,16 +1089,21 @@ restart:
> } else if (ppn & ((1ULL << ptshift) - 1)) {
> /* Misaligned PPN */
> return TRANSLATE_FAIL;
> - } else if (access_type == MMU_DATA_LOAD && !((pte & PTE_R) ||
> - ((pte & PTE_X) && mxr))) {
> + } else if (access_type == MMU_DATA_LOAD && !(((pte & PTE_R) ||
> + sstack_page) || ((pte & PTE_X) && mxr))) {
> /* Read access check failed */
> return TRANSLATE_FAIL;
> - } else if (access_type == MMU_DATA_STORE && !(pte & PTE_W)) {
> + } else if ((access_type == MMU_DATA_STORE && !is_sstack) &&
> + !(pte & PTE_W)) {
>
> Why limit to !is_sstack? Even is_sstack, we should make sure
>
> (access_type == MMU_DATA_STORE && !(pte & PTE_W)
>
> fails.
>
> As per spec if a shadow stack store happens to a memory which is not a
> shadow stack memory then cpu must raise
> access store fault. This failure here converts to a page fault.
> TRANSLATE_PMP_FAIL is the one which converts to
> access faults. So this check here ensures that legacy behavior is
> maintained i.e.
>
> Fair enough. I think we can just use the more redundant but clearer condition
>
> if ((access_type == MMU_DATA_STORE && !is_sstack && !sstack_page) && !(pte &
> PTE_W))
>
> Thus two new memory access types will be postpone to the legal_sstack_access.
> Or you can use current code, and with a comment.
>
>
>
> Few lines down there is a call to `legal_sstack_access` which actually
> does the logic check of
> "If a regular store happened on shadow stack memory, returns false"
> "If a shadow stack access happened on regular memory, returns false"
> And this check returns PMP_TRANSLATE_FAIL which converts to access faults.
>
> On a very high level, shadow stack accesses (sspush/sspop/ssamoswap)
> to regular memory result in access faults.
> Regular store to shadow stack memory result in store/AMO access fault.
> Regular load to shadow stack memory is allowed.
>
> Let me know if this was clear.
>
> /* Write access check failed */
> return TRANSLATE_FAIL;
> } else if (access_type == MMU_INST_FETCH && !(pte & PTE_X)) {
> /* Fetch access check failed */
> return TRANSLATE_FAIL;
> + } else if (!legal_sstack_access(access_type, is_sstack,
> + sstack_page)) {
> + /* Illegal combo of instruction type and page attribute */
> + return TRANSLATE_PMP_FAIL;
>
> Not sure about this. Does the cfi escape the pmp check?
>
> } else {
> /* if necessary, set accessed and dirty bits. */
> target_ulong updated_pte = pte | PTE_A |
> @@ -1107,18 +1163,27 @@ restart:
> ) << PGSHIFT) | (addr & ~TARGET_PAGE_MASK);
>
> /* set permissions on the TLB entry */
> - if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
> + if ((pte & PTE_R) || ((pte & PTE_X) && mxr) || sstack_page) {
>
> I see that we should add the PAGE_READ for sstack_page, such as for a
> no-SS load.
>
> I didn't get this comment. Can you clarify a bit more?
>
> Just a guess why you add sstack_page here. Nothing too much. Ignore it.
>
> Zhiwei
>
> Zhiwei
>
> *prot |= PAGE_READ;
> }
> if ((pte & PTE_X)) {
> *prot |= PAGE_EXEC;
> }
> - /* add write permission on stores or if the page is already
> dirty,
> - so that we TLB miss on later writes to update the dirty bit */
> + /*
> + * add write permission on stores or if the page is already
> dirty,
> + * so that we TLB miss on later writes to update the dirty bit
> + */
> if ((pte & PTE_W) &&
> (access_type == MMU_DATA_STORE || (pte & PTE_D))) {
> *prot |= PAGE_WRITE;
> }
> + if (sstack) {
> + /*
> + * Tell the caller to skip the SS bit in the PMP since we
> + * resolved the attributes via the page table.
> + */
> + *sstack = SSTACK_DC;
> + }
> return TRANSLATE_SUCCESS;
> }
> }
> @@ -1190,13 +1255,13 @@ hwaddr riscv_cpu_get_phys_page_debug(CPUState *cs,
> vaddr addr)
> int mmu_idx = cpu_mmu_index(&cpu->env, false);
>
> if (get_physical_address(env, &phys_addr, &prot, addr, NULL, 0, mmu_idx,
> - true, riscv_cpu_virt_enabled(env), true)) {
> + true, riscv_cpu_virt_enabled(env), true, NULL))
> {
> return -1;
> }
>
> if (riscv_cpu_virt_enabled(env)) {
> if (get_physical_address(env, &phys_addr, &prot, phys_addr, NULL,
> - 0, mmu_idx, false, true, true)) {
> + 0, mmu_idx, false, true, true, NULL)) {
> return -1;
> }
> }
> @@ -1291,6 +1356,8 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
> bool two_stage_indirect_error = false;
> int ret = TRANSLATE_FAIL;
> int mode = mmu_idx;
> + bool sstack = (mmu_idx == MMU_IDX_SS_ACCESS);
> + SStackPmpMode ssmode = sstack ? SSTACK_YES : SSTACK_NO;
> /* default TLB page size */
> target_ulong tlb_size = TARGET_PAGE_SIZE;
>
> @@ -1318,7 +1385,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
> /* Two stage lookup */
> ret = get_physical_address(env, &pa, &prot, address,
> &env->guest_phys_fault_addr, access_type,
> - mmu_idx, true, true, false);
> + mmu_idx, true, true, false, &ssmode);
>
> /*
> * A G-stage exception may be triggered during two state lookup.
> @@ -1342,7 +1409,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
>
> ret = get_physical_address(env, &pa, &prot2, im_address, NULL,
> access_type, mmu_idx, false, true,
> - false);
> + false, NULL);
>
> qemu_log_mask(CPU_LOG_MMU,
> "%s 2nd-stage address=%" VADDR_PRIx " ret %d physical "
> @@ -1353,7 +1420,8 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
>
> if (ret == TRANSLATE_SUCCESS) {
> ret = get_physical_address_pmp(env, &prot_pmp, &tlb_size,
> pa,
> - size, access_type, mode);
> + size, access_type, mode,
> + SSTACK_NO);
>
> qemu_log_mask(CPU_LOG_MMU,
> "%s PMP address=" HWADDR_FMT_plx " ret %d
> prot"
> @@ -1377,7 +1445,8 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
> } else {
> /* Single stage lookup */
> ret = get_physical_address(env, &pa, &prot, address, NULL,
> - access_type, mmu_idx, true, false, false);
> + access_type, mmu_idx, true, false,
> + false, &ssmode);
>
> qemu_log_mask(CPU_LOG_MMU,
> "%s address=%" VADDR_PRIx " ret %d physical "
> @@ -1386,7 +1455,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address,
> int size,
>
> if (ret == TRANSLATE_SUCCESS) {
> ret = get_physical_address_pmp(env, &prot_pmp, &tlb_size, pa,
> - size, access_type, mode);
> + size, access_type, mode, ssmode);
>
> qemu_log_mask(CPU_LOG_MMU,
> "%s PMP address=" HWADDR_FMT_plx " ret %d prot"