qemu-devel
From: Michael Roth
Subject: Re: [Qemu-devel] [RFC][PATCH v2 00/11] QEMU Guest Agent: QMP-based host/guest communication (virtagent)
Date: Tue, 03 May 2011 08:53:43 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10

On 05/03/2011 07:51 AM, Jes Sorensen wrote:
On 04/21/11 15:55, Michael Roth wrote:
Did you do anything with the fsfreeze patches, or were they dropped in
the migration to qapi?

They were pending some changes required on the agent side that weren't
really addressed/doable until this patchset, namely:

1) server-side timeout mechanism to recover from RPCs that can hang
indefinitely or take a really long time (fsfreeze, fopen, etc),
currently it's 30 seconds, may need to bump it to 60 for fsfreeze, or
potentially add an RPC to change the server-side timeout
2) a simple way to temporarily turn off logging so agent doesn't
deadlock itself
3) a way to register a cleanup handler when a timeout occurs.
4) disabling RPCs where proper accounting/logging is required
(guest-open-file, guest-shutdown, etc)

#4 isn't implemented... I think this could be done fairly non-invasively
with something like:

Response important_rpc():
   if (!ga_log("syslog", LEVEL_CRITICAL, "important stuff happening"))
     return ERROR_LOGGING_CURRENTLY_DISABLED

Either that, or maybe simply disable the full command while the freeze
is in progress? I fear we're more likely to miss a case of checking for
logging than we are to miss command disabling?

It should still be very non-invasive; maybe just a flag in the struct
declaring the functions, marking them as logging-required. If the
no-logging flag is set, the command is made to wait, or returns -EAGAIN.


Yup, when I actually started dropping it in I realized this was a much better approach. Although, for now I just added something like "if (!logging_enabled) { error_set(QERR_GA_LOGGING_DISABLED); return; }" to the start of functions where logging is considered critical. That results in the user getting an error message about logging, so it's not too much of a surprise to them.

The actual dispatch code closely mirrors Anthony's dispatch stuff for QMP, so I was hesitant to try to modify it to handle this automatically, since it would require some changes to how the schema parsing/handling is done (we'd probably need to add a "requires_logging" flag in the schema). It wouldn't take much though. Either way, it should be a clean conversion if we decide to go that route.


bool ga_log(log_domain, level, msg):
   if (log_domain == "syslog")
     if (!logging_enabled && is_critical(level))
       return False
     syslog(msg, ...)
   else
     if (logging_enabled)
       normallog(msg, ...)
   return True

With that I think we could actually drop the fsfreeze stuff in. Thoughts?

IMHO it is better to disable the commands rather than just logging, but
either way should allow it to drop in.

Kinda agree, but logging seems to be the real dependency. With the server-side timeouts now in place, even doing stuff like fopen/fwrite is permitted (it would just time out if it blocked too long). It's the logging that we don't really have a way to recover from, because it's not run in a thread we can just nuke after a certain amount of time.

Even when we're not frozen, we can't guarantee an fopen/fwrite/fread will succeed, so failures shouldn't be too much of a surprise since they need to be handled anyway. And determining whether or not a command should be marked as executable during a freeze is somewhat nebulous (fopen might work for read-only access, but hang for write access when O_CREAT is set; fwrite might succeed if it doesn't require a flush; etc.), plus internal things like logging need to be taken into account.

So, for now at least I think it's a reasonable way to do it.


Sorry for the late reply, been a bit swamped here.

No problem, I have your patches in my tree now. They still need a little bit of love and testing, but I should be able to get them out on the list shortly.


Cheers,
Jes



