qemu-devel

Re: [PATCH v6 5/5] accel: abort if we fail to load the accelerator plugin


From: Kevin Wolf
Subject: Re: [PATCH v6 5/5] accel: abort if we fail to load the accelerator plugin
Date: Mon, 26 Sep 2022 12:56:39 +0200

On 26.09.2022 at 09:58, Claudio Fontana wrote:
> On 9/24/22 14:35, Philippe Mathieu-Daudé wrote:
> > On 24/9/22 01:21, Claudio Fontana wrote:
> >> if QEMU is configured with modules enabled, it is possible that the
> >> load of an accelerator module will fail.
> >> Abort in this case, relying on module_object_class_by_name to report
> >> the specific load error if any.
> >>
> >> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> >> ---
> >>   accel/accel-softmmu.c | 8 +++++++-
> >>   1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> >> index 67276e4f52..9fa4849f2c 100644
> >> --- a/accel/accel-softmmu.c
> >> +++ b/accel/accel-softmmu.c
> >> @@ -66,6 +66,7 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>   {
> >>       const char *ac_name;
> >>       char *ops_name;
> >> +    ObjectClass *oc;
> >>       AccelOpsClass *ops;
> >>   
> >>       ac_name = object_class_get_name(OBJECT_CLASS(ac));
> >> @@ -73,8 +74,13 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>   
> >>       ops_name = g_strdup_printf("%s" ACCEL_OPS_SUFFIX, ac_name);
> >> -    ops = ACCEL_OPS_CLASS(module_object_class_by_name(ops_name));
> >> +    oc = module_object_class_by_name(ops_name);
> >> +    if (!oc) {
> >> +        error_report("fatal: could not load module for type '%s'", ops_name);
> >> +        abort();
> > 
> > I still think a coredump won't help at all to figure the problem here: a 
> 
> I can change this from abort() to exit(1). The issue I am seeing is that
> when we fail to create or initialize objects, we usually seem to use
> abort(); the most prominent examples are in qom/object.c:
> 
> static TypeImpl *type_new(const TypeInfo *info)
> {
>     TypeImpl *ti = g_malloc0(sizeof(*ti));
>     int i;
> 
>     g_assert(info->name != NULL);
> 
>     if (type_table_lookup(info->name) != NULL) {
>         fprintf(stderr, "Registering `%s' which already exists\n", info->name);
>         abort();
>     }
> 
> ...
> 
> void object_initialize(void *data, size_t size, const char *typename)
> {
>     TypeImpl *type = type_get_by_name(typename);
> 
> #ifdef CONFIG_MODULES
>     if (!type) {
>         Error *local_err = NULL;
>         int rv = module_load_qom(typename, &local_err);
>         if (rv > 0) {
>             type = type_get_by_name(typename);
>         } else if (rv < 0) {
>             error_report_err(local_err);
>         }
>     }
> #endif
>     if (!type) {
>         error_report("missing object type '%s'", typename);
>         abort();
>     }
> 
>     object_initialize_with_type(data, size, type);
> }
> 
> 
> Do you propose to change only the abort() in accel_init_ops_interfaces()
> to exit(1)?
> 
> Or the other case in the series as well (i.e. qdev_new() in
> hw/core/qdev.c)?
> 
> Do you propose to change this consistently through the codebase
> including the object.c snippets above?

The difference with the snippets above (in the non-module case) is that
calling object_new() with a type that doesn't exist is a bug: it's a
programming error. Calling type_new() twice for the same TypeInfo, or for
two TypeInfos with the same name, is a programming error, too. abort() is
correct for situations that should never happen in a bug-free QEMU.

Not being able to load a module is generally not a bug in QEMU; it's an
error of external origin. So abort() is not appropriate here.
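
To make the distinction concrete, here is a minimal standalone sketch. It
is not QEMU code: FailureKind, known_types, type_is_registered(),
classify_missing_type() and fail_missing_type() are all hypothetical names
made up for illustration. The only point is that the two failure classes
end in different ways, abort() for bugs and exit() for external problems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical failure classes; these names are not taken from QEMU. */
typedef enum {
    FAIL_PROGRAMMING_ERROR, /* cannot happen in a bug-free program */
    FAIL_EXTERNAL_ERROR,    /* caused by the environment, e.g. a missing file */
} FailureKind;

/* Hypothetical stand-in for a table of registered type names. */
static const char *const known_types[] = { "tcg-accel-ops", "kvm-accel-ops" };

/* Return 1 if the name is in the table, 0 otherwise. */
int type_is_registered(const char *name)
{
    /* A NULL name is a caller bug, so an assertion is the right tool. */
    assert(name != NULL);

    for (size_t i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
        if (strcmp(known_types[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Classify a failed lookup. Assumption of this sketch: a type that is
 * expected to come from a loadable module can be missing without any bug
 * in the program; a built-in type cannot.
 */
FailureKind classify_missing_type(int provided_by_module)
{
    return provided_by_module ? FAIL_EXTERNAL_ERROR : FAIL_PROGRAMMING_ERROR;
}

/* Terminate in the way that matches the failure class. Never returns. */
void fail_missing_type(FailureKind kind, const char *name)
{
    if (kind == FAIL_PROGRAMMING_ERROR) {
        fprintf(stderr, "missing object type '%s'\n", name);
        abort(); /* a core dump helps whoever has to debug the code */
    }
    fprintf(stderr, "could not load module for type '%s'\n", name);
    exit(EXIT_FAILURE); /* the user has to fix the installation instead */
}
```

A caller would check type_is_registered() first and only call
fail_missing_type(classify_missing_type(...), name) when the lookup fails.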

The CONFIG_MODULES code in object_initialize() is problematic because it
doesn't have a way to deal with an error case that can happen without a
bug in QEMU. Without changing the prototype of the function to actually
allow error returns (which I suspect might be a very invasive change),
maybe the best approach is to just make it a fatal error and leave the
code mostly as it is in current master:

#ifdef CONFIG_MODULES
    if (!type) {
        /* Assuming that module_load_qom_one() returns an error if the
         * module doesn't exist */
        module_load_qom_one(typename, &error_fatal);
        type = type_get_by_name(typename);
    }
#endif
    if (!type) {
        error_report("missing object type '%s'", typename);
        abort();
    }

    object_initialize_with_type(data, size, type);

This makes it print an error message and exit(), which is honestly not
great at runtime because it doesn't shut down QEMU properly, let alone
just fail the operation and keep running, but it is at least slightly
better than abort().
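
For readers not familiar with the &error_fatal convention, the following is
a simplified standalone model of the idea, not QEMU's actual
implementation. ModelError, model_error_fatal, model_error_set() and
model_load_module() are hypothetical names, and the "module load" is
reduced to an fopen() on a path. What it shows is that the caller picks the
policy by choosing what it passes as errp: a real pointer to get the error
back, NULL to ignore it, or the address of the sentinel to print and
exit(1).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified error object; a real one would carry more state. */
typedef struct ModelError {
    const char *msg;
} ModelError;

/* Sentinel: only its address is meaningful, the value is never read. */
ModelError *model_error_fatal;

/* Report a failure through errp according to the caller's chosen policy. */
void model_error_set(ModelError **errp, const char *msg)
{
    if (errp == &model_error_fatal) {
        /* Caller asked for "fatal": print the message and exit cleanly. */
        fprintf(stderr, "%s\n", msg);
        exit(1);
    }
    if (errp != NULL) {
        /* Caller wants the error back; it must not already hold one. */
        assert(*errp == NULL);
        ModelError *err = malloc(sizeof(*err));
        if (err == NULL) {
            abort();
        }
        err->msg = msg;
        *errp = err;
    }
    /* errp == NULL: the caller chose to ignore the error. */
}

/*
 * Hypothetical loader: returns 1 on success and -1 on failure, setting
 * errp in the failure case. "Loading" is modeled as opening the file.
 */
int model_load_module(const char *path, ModelError **errp)
{
    FILE *f = fopen(path, "rb");

    if (f == NULL) {
        model_error_set(errp, "could not open module file");
        return -1;
    }
    fclose(f);
    return 1;
}
```

With this model, model_load_module(path, &model_error_fatal) never returns
on failure, which is the behavior the snippet above relies on.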

> > module is missing, we know its name. Anyhow I don't mind much, and this
> > can be cleaned later, so:
> 
> Sure, this could be fixed later with a series that tries to use exit()
> vs abort() consistently throughout the codebase when initializing and
> creating objects.

This should mean consistently distinguishing programming errors (i.e.
QEMU bugs) from errors of external origin.

Kevin



