qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/8] qmp: Support for querying stats


From: Markus Armbruster
Subject: Re: [PATCH v3 1/8] qmp: Support for querying stats
Date: Mon, 16 May 2022 14:00:16 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Paolo Bonzini <pbonzini@redhat.com> writes:

> From: Mark Kanda <mark.kanda@oracle.com>
>
> Gathering statistics is important for development, for monitoring and
> for performance measurement.  There are tools such as kvm_stat that do
> this and they rely on the _user_ knowing the interesting data points
> rather than the tool (which can treat them as opaque).
>
> The commands introduced in this commit introduce QMP support for
> querying stats; the goal is to take the capabilities of these tools
> and making them available throughout the whole virtualization stack,
> so that one can observe, monitor and measure virtual machines without
> having shell access + root on the host that runs them.
>
> query-stats returns a list of all stats per target type (only VM
> and vCPU to start); future commits add extra options for specifying
> stat names, vCPU qom paths, and providers.  All these are used by the
> HMP command "info stats".  Because of the development usecases around
> statistics, a good HMP interface is important.
>
> query-stats-schemas returns a list of stats included in each target
> type, with an option for specifying the provider.  The concepts in the
> schema are based on the KVM binary stats' own introspection data, just
> translated to QAPI.
>
> There are two reasons to have a separate schema that is not tied to
> the QAPI schema.  The first is the contents of the schemas: the new
> introspection data provides different information than the QAPI data,
> namely unit of measurement, how the numbers are gathered and change
> (peak/instant/cumulative/histogram), and histogram bucket sizes.
> There's really no reason to have this kind of metadata in the QAPI
> introspection schema (except possibly for the unit of measure, but
> there's a very weak justification).
>
> Another reason is the dynamicity of the schema.  The QAPI introspection
> data is very much static; and while QOM is somewhat more dynamic,
> generally we consider that to be a bug rather than a feature these days.
> On the other hand, the statistics that are exposed by QEMU might be
> passed through from another source, such as KVM, and the disadvantages of
> manually updating the QAPI schema for outweight the benefits from vetting
> the statistics and filtering out anything that seems "too unstable".
> Running old QEMU with new kernel is a supported usecase; if old QEMU
> cannot expose statistics from a new kernel, or if a kernel developer
> needs to change QEMU before gathering new info from the new kernel,
> then that is a poor user interface.
>
> The framework provides a method to register callbacks for these QMP
> commands.  Most of the work in fact is done by the callbacks, and a
> large majority of this patch is new QAPI structs and commands.

Nice!

> Examples (with KVM stats):
>
> - Query all VM stats:
>
> { "execute": "query-stats", "arguments" : { "target": "vm" } }
>
> { "return": [
>      { "provider": "kvm",
>        "stats": [
>           { "name": "max_mmu_page_hash_collisions", "value": 0 },
>           { "name": "max_mmu_rmap_size", "value": 0 },
>           { "name": "nx_lpage_splits", "value": 148 },
>           ... ] },
>      { "provider": "xyz",
>        "stats": [ ... ] }
> ] }
>
> - Query all vCPU stats:
>
> { "execute": "query-stats", "arguments" : { "target": "vcpu" } }
>
> { "return": [
>      { "provider": "kvm",
>        "qom_path": "/machine/unattached/device[0]"
>        "stats": [
>           { "name": "guest_mode", "value": 0 },
>           { "name": "directed_yield_successful", "value": 0 },
>           { "name": "directed_yield_attempted", "value": 106 },
>           ... ] },
>      { "provider": "kvm",
>        "qom_path": "/machine/unattached/device[1]"
>        "stats": [
>           { "name": "guest_mode", "value": 0 },
>           { "name": "directed_yield_successful", "value": 0 },
>           { "name": "directed_yield_attempted", "value": 106 },
>           ... ] },
> ] }
>
> - Retrieve the schemas:
>
> { "execute": "query-stats-schemas" }
>
> { "return": [
>     { "provider": "kvm",
>       "target": "vcpu",
>       "stats": [
>          { "name": "guest_mode",
>            "unit": "none",
>            "base": 10,
>            "exponent": 0,
>            "type": "instant" },
>         { "name": "directed_yield_successful",
>            "unit": "none",
>            "base": 10,
>            "exponent": 0,
>            "type": "cumulative" },
>         ... ]
>     },
>     { "provider": "kvm",
>       "target": "vm",
>       "stats": [
>         { "name": "max_mmu_page_hash_collisions",
>            "unit": "none",
>            "base": 10,
>            "exponent": 0,
>            "type": "peak" },
>         ... ]
>     },
>     { "provider": "xyz",
>       "target": "vm",
>       "stats": [ ... ]
>     }
> ] }
>
> Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/monitor/stats.h |  33 ++++++
>  monitor/qmp-cmds.c      |  71 +++++++++++++
>  qapi/meson.build        |   1 +
>  qapi/qapi-schema.json   |   1 +
>  qapi/stats.json         | 215 ++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 321 insertions(+)
>  create mode 100644 include/monitor/stats.h
>  create mode 100644 qapi/stats.json
>
> diff --git a/include/monitor/stats.h b/include/monitor/stats.h
> new file mode 100644
> index 0000000000..89552ab06f
> --- /dev/null
> +++ b/include/monitor/stats.h
> @@ -0,0 +1,33 @@
> +/*
> + * Copyright (c) 2022 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef STATS_H
> +#define STATS_H
> +
> +#include "qapi/qapi-types-stats.h"
> +
> +typedef void StatRetrieveFunc(StatsResultList **result, StatsTarget target, 
> Error **errp);

Long line.

> +typedef void SchemaRetrieveFunc(StatsSchemaList **result, Error **errp);
> +
> +/*
> + * Register callbacks for the QMP query-stats command.
> + *
> + * @stats_fn: routine to query stats:
> + * @schema_fn: routine to query stat schemas:
> + */
> +void add_stats_callbacks(StatRetrieveFunc *stats_fn,
> +                         SchemaRetrieveFunc *schemas_fn);
> +
> +/*
> + * Helper routines for adding stats entries to the results lists.
> + */
> +void add_stats_entry(StatsResultList **, StatsProvider, const char *id,
> +                     StatsList *stats_list);
> +void add_stats_schema(StatsSchemaList **, StatsProvider, StatsTarget,
> +                      StatsSchemaValueList *);
> +
> +#endif /* STATS_H */
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index 1ebb89f46c..d83faeca88 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -35,6 +35,7 @@
>  #include "qapi/qapi-commands-control.h"
>  #include "qapi/qapi-commands-machine.h"
>  #include "qapi/qapi-commands-misc.h"
> +#include "qapi/qapi-commands-stats.h"
>  #include "qapi/qapi-commands-ui.h"
>  #include "qapi/type-helpers.h"
>  #include "qapi/qmp/qerror.h"
> @@ -43,6 +44,7 @@
>  #include "hw/acpi/acpi_dev_interface.h"
>  #include "hw/intc/intc.h"
>  #include "hw/rdma/rdma.h"
> +#include "monitor/stats.h"
>  
>  NameInfo *qmp_query_name(Error **errp)
>  {
> @@ -441,3 +443,72 @@ HumanReadableText *qmp_x_query_irq(Error **errp)
>  
>      return human_readable_text_from_str(buf);
>  }
> +
> +typedef struct StatsCallbacks {
> +    StatRetrieveFunc *stats_cb;
> +    SchemaRetrieveFunc *schemas_cb;
> +    QTAILQ_ENTRY(StatsCallbacks) next;
> +} StatsCallbacks;
> +
> +static QTAILQ_HEAD(, StatsCallbacks) stats_callbacks =
> +    QTAILQ_HEAD_INITIALIZER(stats_callbacks);
> +
> +void add_stats_callbacks(StatRetrieveFunc *stats_fn,
> +                         SchemaRetrieveFunc *schemas_fn)
> +{
> +    StatsCallbacks *entry = g_new(StatsCallbacks, 1);
> +    entry->stats_cb = stats_fn;
> +    entry->schemas_cb = schemas_fn;
> +
> +    QTAILQ_INSERT_TAIL(&stats_callbacks, entry, next);
> +}
> +
> +StatsResultList *qmp_query_stats(StatsFilter *filter, Error **errp)
> +{
> +    StatsResultList *stats_results = NULL;
> +    StatsCallbacks *entry;
> +
> +    QTAILQ_FOREACH(entry, &stats_callbacks, next) {
> +        entry->stats_cb(&stats_results, filter->target, errp);

If more than one of these sets an error, then the second error_setg()
will trip error_setv()'s assertion.

What's the callback's contract?

> +    }
> +
> +    return stats_results;
> +}
> +
> +StatsSchemaList *qmp_query_stats_schemas(Error **errp)
> +{
> +    StatsSchemaList *stats_results = NULL;
> +    StatsCallbacks *entry;
> +
> +    QTAILQ_FOREACH(entry, &stats_callbacks, next) {
> +        entry->schemas_cb(&stats_results, errp);

Likewise.

> +    }
> +
> +    return stats_results;
> +}
> +
> +void add_stats_entry(StatsResultList **stats_results, StatsProvider provider,
> +                     const char *qom_path, StatsList *stats_list)
> +{
> +    StatsResult *entry = g_new0(StatsResult, 1);

Suggest blank line between declarations and statements, like in the next
function.

> +    entry->provider = provider;
> +    if (qom_path) {
> +        entry->has_qom_path = true;
> +        entry->qom_path = g_strdup(qom_path);
> +    }
> +    entry->stats = stats_list;
> +
> +    QAPI_LIST_PREPEND(*stats_results, entry);
> +}
> +
> +void add_stats_schema(StatsSchemaList **schema_results,
> +                      StatsProvider provider, StatsTarget target,
> +                      StatsSchemaValueList *stats_list)
> +{
> +    StatsSchema *entry = g_new0(StatsSchema, 1);
> +
> +    entry->provider = provider;
> +    entry->target = target;
> +    entry->stats = stats_list;
> +    QAPI_LIST_PREPEND(*schema_results, entry);
> +}

The order of elements doesn't matter in either list, right?

> diff --git a/qapi/meson.build b/qapi/meson.build
> index 656ef0e039..fd5c93d643 100644
> --- a/qapi/meson.build
> +++ b/qapi/meson.build
> @@ -46,6 +46,7 @@ qapi_all_modules = [
>    'replay',
>    'run-state',
>    'sockets',
> +  'stats',
>    'trace',
>    'transaction',
>    'yank',
> diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
> index 4912b9744e..92d7ecc52c 100644
> --- a/qapi/qapi-schema.json
> +++ b/qapi/qapi-schema.json
> @@ -93,3 +93,4 @@
>  { 'include': 'audio.json' }
>  { 'include': 'acpi.json' }
>  { 'include': 'pci.json' }
> +{ 'include': 'stats.json' }
> diff --git a/qapi/stats.json b/qapi/stats.json
> new file mode 100644
> index 0000000000..f0656a6435
> --- /dev/null
> +++ b/qapi/stats.json
> @@ -0,0 +1,215 @@
> +# -*- Mode: Python -*-
> +# vim: filetype=python
> +#
> +# Copyright (c) 2022 Oracle and/or its affiliates.
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or later.
> +# See the COPYING file in the top-level directory.
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +##
> +# = Statistics
> +##
> +
> +##
> +# @StatsType:
> +#
> +# Enumeration of statistics types
> +#
> +# @cumulative: stat is cumulative; value can only increase.
> +# @instant: stat is instantaneous; value can increase or decrease.
> +# @peak: stat is the peak value; value can only increase.
> +# @linear-histogram: stat is a linear histogram.
> +# @log2-histogram: stat is a logarithmic histogram, with one bucket
> +#                  for each power of two.
> +#
> +# Since: 7.1
> +##
> +{ 'enum' : 'StatsType',
> +  'data' : [ 'cumulative', 'instant', 'peak', 'linear-histogram', 
> 'log2-histogram' ] }

Long line.

> +
> +##
> +# @StatsUnit:
> +#
> +# Enumeration of unit of measurement for statistics
> +#
> +# @bytes: stat reported in bytes.
> +# @seconds: stat reported in seconds.
> +# @cycles: stat reported in clock cycles.
> +#
> +# Since: 7.1
> +##
> +{ 'enum' : 'StatsUnit',
> +  'data' : [ 'bytes', 'seconds', 'cycles' ] }
> +
> +##
> +# @StatsProvider:
> +#
> +# Enumeration of statistics providers.
> +#
> +# Since: 7.1
> +##
> +{ 'enum': 'StatsProvider',
> +  'data': [ ] }
> +
> +##
> +# @StatsTarget:
> +#
> +# The kinds of objects on which one can request statistics.
> +#
> +# @vm: statistics that apply to the entire virtual machine or
> +#      the entire QEMU process.
> +#
> +# @vcpu: statistics that apply to a single virtual CPU.
> +#
> +# Since: 7.1
> +##
> +{ 'enum': 'StatsTarget',
> +  'data': [ 'vm', 'vcpu' ] }
> +
> +##
> +# @StatsFilter:
> +#
> +# The arguments to the query-stats command; specifies a target for which to
> +# request statistics.
> +#
> +# Since: 7.1
> +##
> +{ 'struct': 'StatsFilter',
> +  'data': { 'target': 'StatsTarget' } }
> +
> +##
> +# @StatsValue:
> +#
> +# @scalar: single unsigned 64-bit integers.
> +# @list: list of unsigned 64-bit integers (used for histograms).
> +#
> +# Since: 7.1
> +##
> +{ 'alternate': 'StatsValue',
> +  'data': { 'scalar': 'uint64',
> +            'list': [ 'uint64' ] } }
> +
> +##
> +# @Stats:
> +#
> +# @name: name of stat.
> +# @value: stat value.
> +#
> +# Since: 7.1
> +##
> +{ 'struct': 'Stats',
> +  'data': { 'name': 'str',
> +            'value' : 'StatsValue' } }
> +
> +##
> +# @StatsResult:
> +#
> +# @provider: provider for this set of statistics.
> +#
> +# @qom-path: Path to the object for which the statistics are returned,
> +#            if the object is exposed in the QOM tree
> +#
> +# @stats: list of statistics.
> +#
> +# Since: 7.1
> +##
> +{ 'struct': 'StatsResult',
> +  'data': { 'provider': 'StatsProvider',
> +            '*qom-path': 'str',
> +            'stats': [ 'Stats' ] } }
> +
> +##
> +# @query-stats:
> +#
> +# Return runtime-collected statistics for objects such as the
> +# VM or its vCPUs.
> +#
> +# The arguments are a StatsFilter and specify the provider and objects
> +# to return statistics about.
> +#
> +# Returns: a list of StatsResult, one for each provider and object
> +#          (e.g., for each vCPU).
> +#
> +# Since: 7.1
> +##
> +{ 'command': 'query-stats',
> +  'data': 'StatsFilter',
> +  'boxed': true,
> +  'returns': [ 'StatsResult' ] }
> +
> +##
> +# @StatsSchemaValue:
> +#
> +# Schema for a single statistic.
> +#
> +# @name: name of the statistic; each element of the schema is uniquely
> +#        identified by a target, a provider (both available in @StatsSchema)
> +#        and the name.
> +#
> +# @type: kind of statistic.
> +#
> +# @unit: basic unit of measure for the statistic; if missing, the statistic
> +#        is a simple number or counter.
> +#
> +# @base: base for the multiple of @unit in which the statistic is measured.
> +#        Only present if @exponent is non-zero; @base and @exponent together
> +#        form a SI prefix (e.g., _nano-_ for ``base=10`` and ``exponent=-9``)
> +#        or IEC binary prefix (e.g. _kibi-_ for ``base=2`` and 
> ``exponent=10``)
> +#
> +# @exponent: exponent for the multiple of @unit in which the statistic is
> +#            expressed, or 0 for the basic unit
> +#
> +# @bucket-size: Present when @type is "linear-histogram", contains the width
> +#               of each bucket of the histogram.
> +#
> +# Since: 7.1
> +##
> +{ 'struct': 'StatsSchemaValue',
> +  'data': { 'name': 'str',
> +            'type': 'StatsType',
> +            '*unit': 'StatsUnit',
> +            '*base': 'int8',
> +            'exponent': 'int16',
> +            '*bucket-size': 'uint32' } }
> +
> +##
> +# @StatsSchema:
> +#
> +# Schema for all available statistics for a provider and target.
> +#
> +# @provider: provider for this set of statistics.
> +#
> +# @target: the kind of object that can be queried through the provider.
> +#
> +# @stats: list of statistics.
> +#
> +# Since: 7.1
> +##
> +{ 'struct': 'StatsSchema',
> +  'data': { 'provider': 'StatsProvider',
> +            'target': 'StatsTarget',
> +            'stats': [ 'StatsSchemaValue' ] } }
> +
> +##
> +# @query-stats-schemas:
> +#
> +# Return the schema for all available runtime-collected statistics.
> +#
> +# Note: runtime-collected statistics and their names fall outside QEMU's
> +# usual deprecation policies.  QEMU will try to keep the set of available
> +# data stable, together with their names, but will not guarantee stability
> +# at all costs; the same is true of providers that source statistics
> +# externally, e.g. from Linux.  For example, if the same value is being
> +# tracked with different names on different architectures or by different
> +# providers, one of them might be renamed.  A statistic might go away if
> +# an algorithm is changed or some code is removed; changing a default might
> +# cause previously useful statistics to always report 0.  Such changes,
> +# however, they are expected to be rare.

Scratch "they".

> +#
> +# Since: 7.1
> +##
> +{ 'command': 'query-stats-schemas',
> +  'data': { },
> +  'returns': [ 'StatsSchema' ] }




reply via email to

[Prev in Thread] Current Thread [Next in Thread]