qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [qemu-s390x] [PATCH v6] s390x/cpu: expose the guest crash informatio


From: David Hildenbrand
Subject: Re: [qemu-s390x] [PATCH v6] s390x/cpu: expose the guest crash information
Date: Wed, 7 Feb 2018 14:21:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 07.02.2018 13:38, Christian Borntraeger wrote:
> This patch is the s390 implementation of guest crash information,
> similar to commit d187e08dc4 ("i386/cpu: add crash-information QOM
> property") and the related commits. We will detect several crash
> reasons, with the "disabled wait" being the most important one, since
> this is used by all s390 guests as a "panic like" notification.
> 
> Demonstrate these ways with examples as follows.
> 
>   1. crash-information QOM property;
> 
>   Run qemu with -qmp unix:qmp-sock,server, then use utility "qmp-shell"
>   to execute "qom-get" command, and might get the result like,
> 
>   (QEMU) (QEMU) qom-get path=/machine/unattached/device[0] \
>       property=crash-information
>   {"return": {"core": 0, "reason": "disabledwait", "psw-mask": 
> 562956395872256, \
>       "type": "s390", "psw-addr": 1102832}}
> 
>   2. GUEST_PANICKED event reporting;
> 
>   Run qemu with a socket option, and telnet or nc to that,
>   -chardev socket,id=qmp,port=4444,host=localhost,server \
>   -mon chardev=qmp,mode=control,pretty=on \
>   Negotiating the mode by { "execute": "qmp_capabilities" }, and the crash
>   information will be reported on a guest crash event like,
> 
>   {
>     "timestamp": {
>         "seconds": 1518004739,
>         "microseconds": 552563
>     },
>     "event": "GUEST_PANICKED",
>     "data": {
>         "action": "pause",
>         "info": {
>             "core": 0,
>             "psw-addr": 1102832,
>             "reason": "disabledwait",
>             "psw-mask": 562956395872256,
>             "type": "s390"
>         }
>     }
>   }
> 
>   3. log;
> 
>   Run qemu with the parameters: -D <logfile> -d guest_errors, to
>   specify the logfile and log item. The results might be,
> 
>   Guest crashed on cpu 0: disabledwait
>   PSW: 0x0002000180000000 0x000000000010d3f0
> 
> Co-authored-by: Jing Liu <address@hidden>
> Signed-off-by: Christian Borntraeger <address@hidden>
> ---
> V5->V6: - use qapi enum
>       - add core id to the crash parameters
>       - indent fixes
>       - rework debug log message to reuse the enum string
>       - update patch description
> 
>  qapi/run-state.json   | 54 
> +++++++++++++++++++++++++++++++++++++++++++++++++--
>  target/s390x/cpu.c    | 41 ++++++++++++++++++++++++++++++++++++++
>  target/s390x/cpu.h    |  2 ++
>  target/s390x/helper.c |  5 ++++-
>  target/s390x/kvm.c    | 15 +++++++-------
>  vl.c                  | 12 ++++++++++--
>  6 files changed, 116 insertions(+), 13 deletions(-)
> 
> diff --git a/qapi/run-state.json b/qapi/run-state.json
> index bca46a8785..328c86b4bb 100644
> --- a/qapi/run-state.json
> +++ b/qapi/run-state.json
> @@ -320,22 +320,29 @@
>  #
>  # An enumeration of the guest panic information types
>  #
> +# @hyper-v: hyper-v guest panic information type
> +#
> +# @s390: s390 guest panic information type (Since: 2.12)
> +#
>  # Since: 2.9
>  ##
>  { 'enum': 'GuestPanicInformationType',
> -  'data': [ 'hyper-v'] }
> +  'data': [ 'hyper-v', 's390' ] }
>  
>  ##
>  # @GuestPanicInformation:
>  #
>  # Information about a guest panic
>  #
> +# @type: Crash type that defines the hypervisor specific information
> +#
>  # Since: 2.9
>  ##
>  {'union': 'GuestPanicInformation',
>   'base': {'type': 'GuestPanicInformationType'},
>   'discriminator': 'type',
> - 'data': { 'hyper-v': 'GuestPanicInformationHyperV' } }
> + 'data': { 'hyper-v': 'GuestPanicInformationHyperV',
> +           's390': 'GuestPanicInformationS390' } }
>  
>  ##
>  # @GuestPanicInformationHyperV:
> @@ -350,3 +357,46 @@
>             'arg3': 'uint64',
>             'arg4': 'uint64',
>             'arg5': 'uint64' } }
> +
> +##
> +# @S390CrashReason:
> +#
> +# Reason why the CPU is in a crashed state.
> +#
> +# @unknown: no crash reason was set
> +#
> +# @disabledwait: the CPU has entered a disabled wait state
> +#
> +# @extintloop: timer interrupt with new PSW enabled for timer

Is this CPU timer or CKC? Or both?

> +#
> +# @pgmintloop: program interrupt with BAD new PSW
> +#
> +# @opintloop: operation exception interrupt with invalid code at the program
> +#             interrupt new PSW
> +#
> +# Since: 2.12
> +##
> +{ 'enum': 'S390CrashReason',
> +  'data': [ 'unknown',
> +            'disabledwait',
> +            'extintloop',
> +            'pgmintloop',
> +            'opintloop' ] }
> +
> +##
> +# @GuestPanicInformationS390:
> +#
> +# S390 specific guest panic information (PSW)
> +#
> +# @core: core id of the CPU that crashed
> +# @psw-mask: control fields of guest PSW
> +# @psw-addr: guest instruction address
> +# @reason: guest crash reason in human readable form
> +#
> +# Since: 2.12
> +##
> +{'struct': 'GuestPanicInformationS390',
> + 'data': { 'core': 'uint32',
> +           'psw-mask': 'uint64',
> +           'psw-addr': 'uint64',
> +           'reason': 'S390CrashReason' } }
> diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
> index d2e6b9f5c7..9dea65b604 100644
> --- a/target/s390x/cpu.c
> +++ b/target/s390x/cpu.c
> @@ -35,6 +35,8 @@
>  #include "qemu/error-report.h"
>  #include "trace.h"
>  #include "qapi/visitor.h"
> +#include "qapi-visit.h"
> +#include "sysemu/hw_accel.h"
>  #include "exec/exec-all.h"
>  #include "hw/qdev-properties.h"
>  #ifndef CONFIG_USER_ONLY
> @@ -237,6 +239,42 @@ out:
>      error_propagate(errp, err);
>  }
>  
> +static GuestPanicInformation *s390_cpu_get_crash_info(CPUState *cs)
> +{
> +    GuestPanicInformation *panic_info;
> +    S390CPU *cpu = S390_CPU(cs);
> +
> +    cpu_synchronize_state(cs);
> +    panic_info = g_malloc0(sizeof(GuestPanicInformation));
> +
> +    panic_info->type = GUEST_PANIC_INFORMATION_TYPE_S390;
> +    panic_info->u.s390.core = cpu->env.core_id;
> +    panic_info->u.s390.psw_mask = cpu->env.psw.mask;
> +    panic_info->u.s390.psw_addr = cpu->env.psw.addr;
> +    panic_info->u.s390.reason = cpu->env.crash_reason;
> +
> +    return panic_info;
> +}
> +
> +static void s390_cpu_get_crash_info_qom(Object *obj, Visitor *v,
> +                                        const char *name, void *opaque,
> +                                        Error **errp)
> +{
> +    CPUState *cs = CPU(obj);
> +    GuestPanicInformation *panic_info;
> +
> +    if (!cs->crash_occurred) {
> +        error_setg(errp, "No crash occured");
> +        return;
> +    }
> +
> +    panic_info = s390_cpu_get_crash_info(cs);
> +
> +    visit_type_GuestPanicInformation(v, "crash-information", &panic_info,
> +                                     errp);
> +    qapi_free_GuestPanicInformation(panic_info);
> +}
> +
>  static void s390_cpu_initfn(Object *obj)
>  {
>      CPUState *cs = CPU(obj);
> @@ -249,6 +287,8 @@ static void s390_cpu_initfn(Object *obj)
>      cs->env_ptr = env;
>      cs->halted = 1;
>      cs->exception_index = EXCP_HLT;
> +    object_property_add(obj, "crash-information", "GuestPanicInformation",
> +                        s390_cpu_get_crash_info_qom, NULL, NULL, NULL, NULL);
>      s390_cpu_model_register_props(obj);
>  #if !defined(CONFIG_USER_ONLY)
>      qemu_get_timedate(&tm, 0);
> @@ -482,6 +522,7 @@ static void s390_cpu_class_init(ObjectClass *oc, void 
> *data)
>      cc->do_interrupt = s390_cpu_do_interrupt;
>  #endif
>      cc->dump_state = s390_cpu_dump_state;
> +    cc->get_crash_info = s390_cpu_get_crash_info;
>      cc->set_pc = s390_cpu_set_pc;
>      cc->gdb_read_register = s390_cpu_gdb_read_register;
>      cc->gdb_write_register = s390_cpu_gdb_write_register;
> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
> index a1123ad621..a0feed6781 100644
> --- a/target/s390x/cpu.h
> +++ b/target/s390x/cpu.h
> @@ -102,6 +102,8 @@ struct CPUS390XState {
>  
>      PSW psw;
>  
> +    S390CrashReason crash_reason;
> +
>      uint64_t cc_src;
>      uint64_t cc_dst;
>      uint64_t cc_vr;
> diff --git a/target/s390x/helper.c b/target/s390x/helper.c
> index 35d9741918..50395073bd 100644
> --- a/target/s390x/helper.c
> +++ b/target/s390x/helper.c
> @@ -84,12 +84,15 @@ static inline bool is_special_wait_psw(uint64_t psw_addr)
>  
>  void s390_handle_wait(S390CPU *cpu)
>  {
> +    CPUState *cs = CPU(cpu);
> +
>      if (s390_cpu_halt(cpu) == 0) {
>  #ifndef CONFIG_USER_ONLY
>          if (is_special_wait_psw(cpu->env.psw.addr)) {
>              qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
>          } else {
> -            qemu_system_guest_panicked(NULL);
> +            cpu->env.crash_reason = S390_CRASH_REASON_DISABLEDWAIT;
> +            qemu_system_guest_panicked(cpu_get_crash_info(cs));
>          }
>  #endif
>      }
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 8736001156..3b3d8764dc 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -1568,15 +1568,14 @@ static int handle_instruction(S390CPU *cpu, struct 
> kvm_run *run)
>      return r;
>  }
>  
> -static void unmanageable_intercept(S390CPU *cpu, const char *str, int 
> pswoffset)
> +static void unmanageable_intercept(S390CPU *cpu, S390CrashReason reason,
> +                                   int pswoffset)
>  {
>      CPUState *cs = CPU(cpu);
>  
> -    error_report("Unmanageable %s! CPU%i new PSW: 0x%016lx:%016lx",
> -                 str, cs->cpu_index, ldq_phys(cs->as, cpu->env.psa + 
> pswoffset),
> -                 ldq_phys(cs->as, cpu->env.psa + pswoffset + 8));
>      s390_cpu_halt(cpu);
> -    qemu_system_guest_panicked(NULL);
> +    cpu->env.crash_reason = reason;
> +    qemu_system_guest_panicked(cpu_get_crash_info(cs));
>  }
>  
>  /* try to detect pgm check loops */
> @@ -1606,7 +1605,7 @@ static int handle_oper_loop(S390CPU *cpu, struct 
> kvm_run *run)
>          !(oldpsw.mask & PSW_MASK_PSTATE) &&
>          (newpsw.mask & PSW_MASK_ASC) == (oldpsw.mask & PSW_MASK_ASC) &&
>          (newpsw.mask & PSW_MASK_DAT) == (oldpsw.mask & PSW_MASK_DAT)) {
> -        unmanageable_intercept(cpu, "operation exception loop",
> +        unmanageable_intercept(cpu, S390_CRASH_REASON_OPINTLOOP,
>                                 offsetof(LowCore, program_new_psw));
>          return EXCP_HALTED;
>      }
> @@ -1627,12 +1626,12 @@ static int handle_intercept(S390CPU *cpu)
>              r = handle_instruction(cpu, run);
>              break;
>          case ICPT_PROGRAM:
> -            unmanageable_intercept(cpu, "program interrupt",
> +            unmanageable_intercept(cpu, S390_CRASH_REASON_PGMINTLOOP,
>                                     offsetof(LowCore, program_new_psw));
>              r = EXCP_HALTED;
>              break;
>          case ICPT_EXT_INT:
> -            unmanageable_intercept(cpu, "external interrupt",
> +            unmanageable_intercept(cpu, S390_CRASH_REASON_EXTINTLOOP,
>                                     offsetof(LowCore, external_new_psw));
>              r = EXCP_HALTED;
>              break;
> diff --git a/vl.c b/vl.c
> index e517a8d995..980348f504 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1738,7 +1738,7 @@ void qemu_system_reset(ShutdownCause reason)
>  
>  void qemu_system_guest_panicked(GuestPanicInformation *info)
>  {
> -    qemu_log_mask(LOG_GUEST_ERROR, "Guest crashed\n");
> +    qemu_log_mask(LOG_GUEST_ERROR, "Guest crashed");
>  
>      if (current_cpu) {
>          current_cpu->crash_occurred = true;
> @@ -1754,13 +1754,21 @@ void qemu_system_guest_panicked(GuestPanicInformation 
> *info)
>  
>      if (info) {
>          if (info->type == GUEST_PANIC_INFORMATION_TYPE_HYPER_V) {
> -            qemu_log_mask(LOG_GUEST_ERROR, "HV crash parameters: (%#"PRIx64
> +            qemu_log_mask(LOG_GUEST_ERROR, "\nHV crash parameters: (%#"PRIx64
>                            " %#"PRIx64" %#"PRIx64" %#"PRIx64" %#"PRIx64")\n",
>                            info->u.hyper_v.arg1,
>                            info->u.hyper_v.arg2,
>                            info->u.hyper_v.arg3,
>                            info->u.hyper_v.arg4,
>                            info->u.hyper_v.arg5);
> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_S390) {
> +            qemu_log_mask(LOG_GUEST_ERROR, " on cpu %d: %s\n"
> +                          "PSW: 0x%016" PRIx64 " 0x%016" PRIx64"\n",
> +                          info->u.s390.core,
> +                          qapi_enum_lookup(&S390CrashReason_lookup,
> +                                           info->u.s390.reason),
> +                          info->u.s390.psw_mask,
> +                          info->u.s390.psw_addr);
>          }
>          qapi_free_GuestPanicInformation(info);
>      }
> 

The QAPI interface looks good to me.

(Having "GuestPanicInformation" as a property of CPU looks very strange,
but as you're implementing it just like Hyper-V does right now, this
should be fine.)

-- 

Thanks,

David / dhildenb



reply via email to

[Prev in Thread] Current Thread [Next in Thread]