qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon


From: Luiz Capitulino
Subject: Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon
Date: Mon, 20 Jun 2011 11:16:52 -0300

On Sun, 19 Jun 2011 14:00:30 -0500
Michael Roth <address@hidden> wrote:

> On 06/17/2011 10:25 PM, Luiz Capitulino wrote:
> > On Fri, 17 Jun 2011 16:25:32 -0500
> > Michael Roth<address@hidden>  wrote:
> >
> >> On 06/17/2011 03:13 PM, Luiz Capitulino wrote:
> >>> On Fri, 17 Jun 2011 14:21:31 -0500
> >>> Michael Roth<address@hidden>   wrote:
> >>>
> >>>> On 06/16/2011 01:42 PM, Luiz Capitulino wrote:
> >>>>> On Tue, 14 Jun 2011 15:06:22 -0500
> >>>>> Michael Roth<address@hidden>    wrote:
> >>>>>
> >>>>>> This is the actual guest daemon, it listens for requests over a
> >>>>>> virtio-serial/isa-serial/unix socket channel and routes them through
> >>>>>> to dispatch routines, and writes the results back to the channel in
> >>>>>> a manner similar to QMP.
> >>>>>>
> >>>>>> A shorthand invocation:
> >>>>>>
> >>>>>>      qemu-ga -d
> >>>>>>
> >>>>>> Is equivalent to:
> >>>>>>
> >>>>>>      qemu-ga -c virtio-serial -p 
> >>>>>> /dev/virtio-ports/org.qemu.guest_agent \
> >>>>>>              -p /var/run/qemu-guest-agent.pid -d
> >>>>>>
> >>>>>> Signed-off-by: Michael Roth<address@hidden>
> >>>>>
> >>>>> Would be nice to have a more complete description, like explaining how 
> >>>>> to
> >>>>> do a simple test.
> >>>>>
> >>>>> And this can't be built...
> >>>>>
> >>>>>> ---
> >>>>>>     qemu-ga.c              |  631 
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>     qga/guest-agent-core.h |    4 +
> >>>>>>     2 files changed, 635 insertions(+), 0 deletions(-)
> >>>>>>     create mode 100644 qemu-ga.c
> >>>>>>
> >>>>>> diff --git a/qemu-ga.c b/qemu-ga.c
> >>>>>> new file mode 100644
> >>>>>> index 0000000..df08d8c
> >>>>>> --- /dev/null
> >>>>>> +++ b/qemu-ga.c
> >>>>>> @@ -0,0 +1,631 @@
> >>>>>> +/*
> >>>>>> + * QEMU Guest Agent
> >>>>>> + *
> >>>>>> + * Copyright IBM Corp. 2011
> >>>>>> + *
> >>>>>> + * Authors:
> >>>>>> + *  Adam Litke<address@hidden>
> >>>>>> + *  Michael Roth<address@hidden>
> >>>>>> + *
> >>>>>> + * This work is licensed under the terms of the GNU GPL, version 2 or 
> >>>>>> later.
> >>>>>> + * See the COPYING file in the top-level directory.
> >>>>>> + */
> >>>>>> +#include<stdlib.h>
> >>>>>> +#include<stdio.h>
> >>>>>> +#include<stdbool.h>
> >>>>>> +#include<glib.h>
> >>>>>> +#include<gio/gio.h>
> >>>>>> +#include<getopt.h>
> >>>>>> +#include<termios.h>
> >>>>>> +#include<syslog.h>
> >>>>>> +#include "qemu_socket.h"
> >>>>>> +#include "json-streamer.h"
> >>>>>> +#include "json-parser.h"
> >>>>>> +#include "qint.h"
> >>>>>> +#include "qjson.h"
> >>>>>> +#include "qga/guest-agent-core.h"
> >>>>>> +#include "qga-qmp-commands.h"
> >>>>>> +#include "module.h"
> >>>>>> +
> >>>>>> +#define QGA_VIRTIO_PATH_DEFAULT 
> >>>>>> "/dev/virtio-ports/org.qemu.guest_agent"
> >>>>>> +#define QGA_PIDFILE_DEFAULT "/var/run/qemu-va.pid"
> >>>>>> +#define QGA_BAUDRATE_DEFAULT B38400 /* for isa-serial channels */
> >>>>>> +#define QGA_TIMEOUT_DEFAULT 30*1000 /* ms */
> >>>>>> +
> >>>>>> +struct GAState {
> >>>>>> +    const char *proxy_path;
> >>>>>
> >>>>> Where is this used?
> >>>>>
> >>>>
> >>>> Nowhere actually. Will remove.
> >>>>
> >>>>>> +    JSONMessageParser parser;
> >>>>>> +    GMainLoop *main_loop;
> >>>>>> +    guint conn_id;
> >>>>>> +    GSocket *conn_sock;
> >>>>>> +    GIOChannel *conn_channel;
> >>>>>> +    guint listen_id;
> >>>>>> +    GSocket *listen_sock;
> >>>>>> +    GIOChannel *listen_channel;
> >>>>>> +    const char *path;
> >>>>>> +    const char *method;
> >>>>>> +    bool virtio; /* fastpath to check for virtio to deal with poll() 
> >>>>>> quirks */
> >>>>>> +    GACommandState *command_state;
> >>>>>> +    GLogLevelFlags log_level;
> >>>>>> +    FILE *log_file;
> >>>>>> +    bool logging_enabled;
> >>>>>> +};
> >>>>>> +
> >>>>>> +static void usage(const char *cmd)
> >>>>>> +{
> >>>>>> +    printf(
> >>>>>> +"Usage: %s -c<channel_opts>\n"
> >>>>>> +"QEMU Guest Agent %s\n"
> >>>>>> +"\n"
> >>>>>> +"  -c, --channel     channel method: one of unix-connect, 
> >>>>>> virtio-serial, or\n"
> >>>>>> +"                    isa-serial (virtio-serial is the default)\n"
> >>>>>> +"  -p, --path        channel path (%s is the default for 
> >>>>>> virtio-serial)\n"
> >>>>>> +"  -l, --logfile     set logfile path, logs to stderr by default\n"
> >>>>>> +"  -f, --pidfile     specify pidfile (default is %s)\n"
> >>>>>> +"  -v, --verbose     log extra debugging information\n"
> >>>>>> +"  -V, --version     print version information and exit\n"
> >>>>>> +"  -d, --daemonize   become a daemon\n"
> >>>>>> +"  -h, --help        display this help and exit\n"
> >>>>>> +"\n"
> >>>>>> +"Report bugs to<address@hidden>\n"
> >>>>>> +    , cmd, QGA_VERSION, QGA_VIRTIO_PATH_DEFAULT, QGA_PIDFILE_DEFAULT);
> >>>>>> +}
> >>>>>> +
> >>>>>> +static void conn_channel_close(GAState *s);
> >>>>>> +
> >>>>>> +static const char *ga_log_level_str(GLogLevelFlags level)
> >>>>>> +{
> >>>>>> +    switch (level&    G_LOG_LEVEL_MASK) {
> >>>>>> +        case G_LOG_LEVEL_ERROR:
> >>>>>> +            return "error";
> >>>>>> +        case G_LOG_LEVEL_CRITICAL:
> >>>>>> +            return "critical";
> >>>>>> +        case G_LOG_LEVEL_WARNING:
> >>>>>> +            return "warning";
> >>>>>> +        case G_LOG_LEVEL_MESSAGE:
> >>>>>> +            return "message";
> >>>>>> +        case G_LOG_LEVEL_INFO:
> >>>>>> +            return "info";
> >>>>>> +        case G_LOG_LEVEL_DEBUG:
> >>>>>> +            return "debug";
> >>>>>> +        default:
> >>>>>> +            return "user";
> >>>>>> +    }
> >>>>>> +}
> >>>>>> +
> >>>>>> +bool ga_logging_enabled(GAState *s)
> >>>>>> +{
> >>>>>> +    return s->logging_enabled;
> >>>>>> +}
> >>>>>> +
> >>>>>> +void ga_disable_logging(GAState *s)
> >>>>>> +{
> >>>>>> +    s->logging_enabled = false;
> >>>>>> +}
> >>>>>> +
> >>>>>> +void ga_enable_logging(GAState *s)
> >>>>>> +{
> >>>>>> +    s->logging_enabled = true;
> >>>>>> +}
> >>>>>
> >>>>> Just to check I got this right, this is needed because of the fsfreeze
> >>>>> command, correct? Isn't it better to have a more descriptive name, like
> >>>>> fsfrozen?
> >>>>>
> >>>>> First I thought this was about a log file. Then I realized this was
> >>>>> probably about letting the user log in, but we don't seem to have this
> >>>>> functionality so I got confused.
> >>>>>
> >>>>
> >>>> Yup, this is currently due to fsfreeze support. From the perspective of
> >>>> the fsfreeze command the explicit "is_frozen" check makes more sense,
> >>>> but the reason it affects other RPCs is because because we can't log
> >>>> stuff in that state. If an RPC shoots itself in the foot by writing to a
> >>>> frozen filesystem we don't really care so much, and up until recently
> >>>> that case was handled with a pthread_cancel timeout mechanism (was
> >>>> removed for the time being, will re-implement using a child process most
> >>>> likely).
> >>>>
> >>>> What we don't want to do is give a host a way to bypass the expectation
> >>>> we set for guest owners by allowing RPCs to be logged. So that's what
> >>>> the check is based on, rather than lower level details like *why*
> >>>> logging is disabled. Also, I'd really like to avoid a precedence where a
> >>>> single RPC can place restrictions on all the others, so the logging
> >>>> aspect seemed general enough that it doesn't necessarily provide that
> >>>> precedence.
> >>>>
> >>>> This has come up a few times without any real consensus. I can probably
> >>>> go either way, but I think the check for logging is easier to set
> >>>> expectations with: "if logging is important from an auditing
> >>>> perspective, don't execute this if logging is disabled". beyond that,
> >>>> same behavior/symantics you'd get with anything that causes a write() on
> >>>> a filesystem in a frozen state: block.
> >>>
> >>> While I understand your arguments, I still find the current behavior
> >>> confusing: you issue a guest-open-file command and get a QgaLoggingFailed
> >>> error. The first question is: what does a log write fail have to do with 
> >>> me
> >>> opening a file? The second question is: what should I do? Try again? Give 
> >>> up?
> >>>
> >>
> >> I agree it's a better solution for the client there, but at the same time:
> >>
> >> guest_privileged_ping():
> >>     if fsfreeze.status == FROZEN:
> >>       syslog("privileged ping, thought you should know")
> >>     else:
> >>       return "error, filesystem frozen"
> >>
> >> Seems silly to the client as well. "Why does a ping command care about
> >> the filesystem state?"
> >>
> >> Inability to log is the true error. That's also the case for the
> >> guest-file-open command.
> >
> > There's a difference. In the guest ping the inability to log is an
> > internal agent error, it's not interesting to the client. We could just
> > ignore the error and reply back (unless we define that guest-privileged-ping
> > requires writing to disk).
> >
> 
> To clarify, these's 2 different types of errors here and their relation 
> to fsfreeze is causing the separation to get munged a bit:
> 
> 1) Logging/accounting errors: even though we try to make it clear 
> wherever possible this solution is based on a "trusted" hypervisor that 
> already has substantial privileges over guests (direct access to images, 
> etc), one of our internal use cases is the ability for customers to 
> properly account for actions initiated by the guest agent. Shutdowns, 
> whats files were read/written, etc.

This is fine.

> For some commands, this accounting is not important. guest-ping, or 
> guest-info, for instance, is uninteresting from a security accounting 
> perspective. So logging is in fact optional and logging failures are 
> silently ignored.
> 
> guest-shutdown, guest-file-open, guest-fsfreeze, however, are important. 
> So the QgaLoggingError we currently report really means "this command 
> requires logging, but logging has been disabled". You're right that this 
> is because of fsfreeze, but it's not because of the filesystem state. 
> guest-file-open on a network mount that wasn't frozen would *still*, by 
> design, report an error due to inability to log.

Completely ignoring the file-system state and considering only what you
say about security, this doesn't make sense to me. Actually, I think it's
a flaw, because a client could open the daemon's log file and write garbage
on it.

But, how does the ability to log protects you from anything? I would understand
it's importance if the user had to provide credentials to use the agent,
but s/he doesn't. It's like you were unable to use a linux box because syslogd
is not running.

This looks like a misfeature to me. If you really want to provide it, then
you could provide a --enable-log-error which, when enabled, would make the
daemon not to execute *any* command if the it's able to log its actions.
Otherwise log errors are just ignored.

> For this case I do see your point about the error not being very helpful 
> to users, which is why I think something like:
> 
> "Guest agent failed to log RPC due to frozen filesystems"
> 
> Is the best way to report it. If we need to fix up the error id to 
> something like QgaLoggingToFrozenFilesystem I'd be fine with that.
> 
> 2) EWOULDBLOCK or I/O errors due to writing to frozen filesystems: this 
> is currently treated as a PEBKAC and not enforced/reported at all. I 
> wouldn't be against disabling guest-file-write as a side-effect of 
> freezing, but that's not totally correct. Writes to non-frozen 
> filesystems like networked or sysfs mounts is technically still okay, 
> for instance. We store path information in guest-file-open and use it to 
> scan against fsfreeze's state info about frozen mounts to handle this a 
> little better.

Yes, I agree with you here.

> 
> Even then you still have RPCs like guest-shutdown that may do a syslog() 
> that would cause the command to freeze, or in the future we may add the 
> ability for the guest agent to execute user/distro-installed scripts 
> that may/may not need to write to the filesystem. Some these might even 
> be intended to run when the filesystem is frozen...cleanup scripts for 
> databases and whatnot for snapshotting is something I've seen come up.
> 
> We can
> 
> a) be very restrictive, which I'm not totally against...my initial 
> inclination was to disable everything except 
> guest-fsfreeze-thaw/guest-fsfreeze-status while frozen, but this has 
> some limitations that aren't as obscure/unlikely as one would hope.
> 
> b) we can be completely unrestrictive about it and give users the same 
> experiences they'd get at a command-line if they, or another user on the 
> system, was messing with fsfreeze.
> 
> c) try to capture some common cases, but make it clear that you can 
> still shoot yourself in the foot using the guest agent on a frozen 
> filesystem, evaluating when to enforce this in code on a case-by-case basis.
> 
> Personally I like b), mostly because it's the least work, but also 
> because it's actually the easiest way to set expectations about fsfreeze.
> 
> > The guest-file-write command, on the other hand, clearly requires to write
> > to disk, so a client would expect a EWOULDBLOCK error.
> >
> > EWOULDBLOCK looks good to me. We could define as general rule that
> > commands don't block, so clients should always expect a EWOULDBLOCK.
> >
> 
> This falls under 2)
> 
> >> There's nothing wrong with opening a read-only
> >> file on a frozen filesystem, it's the fact that we can't log the open
> >> that causes us to bail out..
> >
> > But why do we bail out? Why can't we just ignore the fact we can't log?
> >
> 
> This falls under 1)
> 
> >> So, what if we just munge the 2 to give the user the proper clues to fix
> >> things, and instead return an error like:
> >>
> >> "Guest agent failed to log RPC (is the filesystem frozen?)"?
> >
> > The 'desc' part of an error is a human error descriptor, clients shouldn't
> > parse it. The real error is the error class.
> >
> 
> As mentioned above, we could change the error id itself to capture both 
> the immediate error and the underlying error.
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]