
From: Mark Cave-Ayland
Subject: Re: [Qemu-devel] [PATCH] cputlb: don't cpu_abort() if guest tries to execute outside RAM or ROM
Date: Mon, 20 Jun 2016 20:16:53 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.8.0

On 20/06/16 18:07, Peter Maydell wrote:

> In get_page_addr_code(), if the guest program counter turns out not to
> be in ROM or RAM, we can't handle executing from it, and we call
> cpu_abort(). This results in the message
>   qemu: fatal: Trying to execute code outside RAM or ROM at 0x08000000
> followed by a guest register dump, and then QEMU dumps core.
> 
> This situation happens in one of two cases:
>  (1) a guest kernel bug, where it jumped off into nowhere
>  (2) a user command line mistake, where they tried to run an image for
>      board A on a QEMU model of board B, or where they didn't provide
>      an image at all, and QEMU executed through a ROM or RAM full of
>      NOP instructions and then fell off the end
> 
> In either case, a core dump of QEMU itself is entirely useless, and
> only confuses users into thinking that this is a bug in QEMU rather
> than a bug in the guest or a problem with their command line. (This
> is a variation on the general idea that we shouldn't assert() on
> something the user can accidentally provoke.)
> 
> Replace the cpu_abort() with something that explains the situation
> a bit better and exits QEMU without dumping core.
> 
> (See LP:1062220 for several examples of confused users.)
> 
> Signed-off-by: Peter Maydell <address@hidden>
> ---
> I've been meaning to do this for a while now...hopefully the
> expanded error message should reduce user confusion.
> 
>  cputlb.c | 39 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/cputlb.c b/cputlb.c
> index 23c9b91..079e497 100644
> --- a/cputlb.c
> +++ b/cputlb.c
> @@ -30,6 +30,8 @@
>  #include "exec/ram_addr.h"
>  #include "exec/exec-all.h"
>  #include "tcg/tcg.h"
> +#include "qemu/error-report.h"
> +#include "exec/log.h"
>  
>  /* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */
>  /* #define DEBUG_TLB */
> @@ -427,6 +429,39 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
>                              prot, mmu_idx, size);
>  }
>  
> +static void report_bad_exec(CPUState *cpu, target_ulong addr)
> +{
> +    /* Accidentally executing outside RAM or ROM is quite common for
> +     * several user-error situations, so report it in a way that
> +     * makes it clear that this isn't a QEMU bug and provide suggestions
> +     * about what a user could do to fix things.
> +     */
> +    error_report("Trying to execute code outside RAM or ROM at 0x"
> +                 TARGET_FMT_lx, addr);
> +    error_printf("This usually means one of the following happened:\n\n"
> +                 "(1) You told QEMU to execute a kernel for the wrong machine "
> +                 "type, and it crashed on startup (eg trying to run a "
> +                 "raspberry pi kernel on a versatilepb QEMU machine)\n"
> +                 "(2) You didn't give QEMU a kernel or BIOS filename at all, "
> +                 "and QEMU executed a ROM full of no-op instructions until "
> +                 "it fell off the end\n"
> +                 "(3) Your guest kernel has a bug and crashed by jumping "
> +                 "off into nowhere\n\n"
> +                 "This is almost always one of the first two, so check your "
> +                 "command line and that you are using the right type of kernel "
> +                 "for this machine.\n"
> +                 "If you think option (3) is likely then you can try debugging "
> +                 "your guest with the -d debug options; in particular "
> +                 "-d guest_errors will cause the log to include a dump of the "
> +                 "guest register state at this point.\n\n"
> +                 "Execution cannot continue; stopping here.\n\n");
> +
> +    /* Report also to the logs, with more detail including register dump */
> +    qemu_log_mask(LOG_GUEST_ERROR, "qemu: fatal: Trying to execute code "
> +                  "outside RAM or ROM at 0x" TARGET_FMT_lx "\n", addr);
> +    log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
> +}
> +
>  /* NOTE: this function can trigger an exception */
>  /* NOTE2: the returned address is not exactly the physical address: it
>   * is actually a ram_addr_t (in system mode; the user mode emulation
> @@ -455,8 +490,8 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
>          if (cc->do_unassigned_access) {
>              cc->do_unassigned_access(cpu, addr, false, true, 0, 4);
>          } else {
> -            cpu_abort(cpu, "Trying to execute code outside RAM or ROM at 0x"
> -                      TARGET_FMT_lx "\n", addr);
> +            report_bad_exec(cpu, addr);
> +            exit(1);
>          }
>      }
>      p = (void *)((uintptr_t)addr + env1->tlb_table[mmu_idx][page_index].addend);
> 

Excellent! Another use case I see here is HelenOS/ppc, whose bootloader
is fixed at address 0x8000000 (128MB): if you don't increase the guest
memory above the default, you end up with this panic, which as you
rightly point out is often confusing.
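
To spell out the arithmetic, here is a minimal sketch (not QEMU code). It
assumes guest RAM is a single contiguous region starting at physical
address 0 and that the default RAM size is 128MB; addr_in_ram() and the
macro names are hypothetical, made up purely for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Load address of the HelenOS/ppc bootloader, as quoted above. */
#define HELENOS_PPC_LOAD_ADDR 0x8000000ULL

/* Hypothetical helper, not part of QEMU: returns true if addr lies inside
 * a single RAM region that starts at ram_base and is ram_size bytes long.
 * Assumption: RAM is one contiguous block. */
static bool addr_in_ram(uint64_t addr, uint64_t ram_base, uint64_t ram_size)
{
    /* Subtract first so the comparison cannot overflow. */
    return addr >= ram_base && addr - ram_base < ram_size;
}
```

Under those assumptions addr_in_ram(HELENOS_PPC_LOAD_ADDR, 0, 128 * MiB)
is false, because 0x8000000 is exactly 128 * 1024 * 1024, i.e. the first
byte past the end of RAM. With more memory (e.g. -m 256) the same check
is true and the bootloader address is backed by RAM.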


ATB,

Mark.
