Re: [Qemu-devel] [PATCH] Fix Event Viewer errors caused by qemu-ga


From: Sameeh Jubran
Subject: Re: [Qemu-devel] [PATCH] Fix Event Viewer errors caused by qemu-ga
Date: Wed, 22 Mar 2017 10:14:53 +0200

On Tue, Mar 21, 2017 at 6:09 PM, Michael Roth <address@hidden> wrote:

> Quoting Sameeh Jubran (2017-03-21 05:49:52)
> > When the command "guest-fsfreeze-freeze" is executed it causes
> > the VSS service to log the errors below in the Event Viewer.
> >
> > These errors are caused by two issues in the function "CommitSnapshots" in
> > provider.cpp:
> >
> > 1. When VSS_TIMEOUT_MSEC expires the function returns E_ABORT. This causes
> > the error #12293.
> >
> > 2. The VSS_TIMEOUT_MSEC value is too big. According to MSDN, the
> > "Flush & Hold" operation has a 10-second timeout that is not configurable.
> > "CommitSnapshots" is part of the "Flush & Hold" process, so any
> > timeout longer than 10 seconds causes the error #12298 and anything
> > longer than 40 seconds causes the error #12340. All this info can be
> > found here:
> > https://msdn.microsoft.com/en-us/library/windows/desktop/aa384589(v=vs.85).aspx
>
> Not sure how best to deal with this. Technically our CommitSnapshots
> interface is driven by the backup job being run by QGA/QEMU management
> side. If that amount of time exceeds the VSS limits then I think it's
> appropriate for VSS to log the error accordingly. VSS_TIMEOUT_MSEC here
> doesn't actually have too much correlation with the VSS-set timeout,
> IIRC it's specifically picked to exceed both the 10 and 40 second
> timeouts and acts more as a fail-safe timeout.

The timeout was added in commit b39297aedfabe9b2c426cd540413be991500da25.
There is no point in setting the timeout this long, as the actual freeze
(Flush and Hold Writes) is limited to 10 seconds (not configurable)
according to MSDN:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa384589%28v=vs.85%29.aspx
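To make the intent concrete, here is a rough, portable sketch of the behaviour
I have in mind. It is not the provider code: Event, commit_snapshots, kOk and
the two constants are hypothetical stand-ins for the Win32 event handles,
CommitSnapshots, S_OK and VSS_TIMEOUT_MSEC, and the 10-second figure is the
limit from the MSDN page above. The wait stays below the Flush and Hold limit,
and a timeout is still signalled to qemu-ga but is no longer turned into an
abort code.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a Win32 manual-reset event.
class Event {
public:
    void set() {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Returns true if the event was signalled before the timeout elapsed.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }

    bool is_set() {
        std::lock_guard<std::mutex> lock(m_);
        return signaled_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Assumed values: the non-configurable Flush and Hold limit and a wait
// chosen to stay below it.
constexpr std::chrono::milliseconds kFlushAndHoldLimit(10 * 1000);
constexpr std::chrono::milliseconds kVssTimeout(9 * 1000);
static_assert(kVssTimeout < kFlushAndHoldLimit,
              "provider wait must not outlast the Flush and Hold limit");

constexpr long kOk = 0;  // stand-in for S_OK

// Wait for the thaw event; on timeout, notify the agent through the
// timeout event but still report success instead of an abort code.
long commit_snapshots(Event &thaw, Event &timed_out,
                      std::chrono::milliseconds timeout = kVssTimeout) {
    if (!thaw.wait_for(timeout)) {
        timed_out.set();
    }
    return kOk;
}
```

With this shape the caller sees success in both cases, and only the timeout
event tells the agent that the thaw did not arrive in time.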

>
> Are the event logs causing issues? FWIW, on the posix side we also opt
> for gratuitous logging to syslog and such, the idea there being that
> cooperative guests would prefer transparency on how the agent is being
> used.
>
Apparently, these error logs are annoying to some
(https://bugzilla.redhat.com/show_bug.cgi?id=1387125).
Moreover, I don't think our implementation of the freeze operation,
which is a workaround in a way, should log errors that we know are
false alarms.

>
> That said, I do think error 12293 is unnecessary, since IIUC it would
> always be paired with the actual VSS-reported error. So avoiding the
> E_ABORT seems reasonable either way.
>
> >
> > |event id|                           error                              |
> > * 12293  : Volume Shadow Copy Service error: Error calling a routine on a
> >            Shadow Copy Provider {00000000-0000-0000-0000-000000000000}.
> >        Routine details CommitSnapshots [hr = 0x80004004, Operation
> >        aborted.
> >
> > * 12340  : Volume Shadow Copy Error: VSS waited more than 40 seconds for
> >            all volumes to be flushed.  This caused volume
> >        \\?\Volume{62a171da-32ec-11e4-80b1-806e6f6e6963}\ to timeout
> >        while waiting for the release-writes phase of shadow copy
> >        creation. Trying again when disk activity is lower may solve
> >        this problem.
> >
> > * 12298  : Volume Shadow Copy Service error: The I/O writes cannot be held
> >            during the shadow copy creation period on volume
> >            \\?\Volume{62a171d9-32ec-11e4-80b1-806e6f6e6963}\. The volume
> >        index in the shadow copy set is 0. Error details:
> >        Open[0x00000000, The operation completed successfully. ],
> >        Flush[0x00000000, The operation completed successfully.],
> >        Release[0x00000000, The operation completed successfully.],
> >        OnRun[0x80042314, The shadow copy provider timed out while
> >        holding writes to the volume being shadow copied. This is
> >        probably due to excessive activity on the volume by an
> >        application or a system service. Try again later when activity
> >        on the volume is reduced.
> >
> > Signed-off-by: Sameeh Jubran <address@hidden>
> > ---
> >  qga/vss-win32/provider.cpp | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/qga/vss-win32/provider.cpp b/qga/vss-win32/provider.cpp
> > index ef94669..d72f4d4 100644
> > --- a/qga/vss-win32/provider.cpp
> > +++ b/qga/vss-win32/provider.cpp
> > @@ -15,7 +15,7 @@
> >  #include <inc/win2003/vscoordint.h>
> >  #include <inc/win2003/vsprov.h>
> >
> > -#define VSS_TIMEOUT_MSEC (60*1000)
> > +#define VSS_TIMEOUT_MSEC (9 * 1000)
> >
> >  static long g_nComObjsInUse;
> >  HINSTANCE g_hinstDll;
> > @@ -377,7 +377,6 @@ STDMETHODIMP CQGAVssProvider::CommitSnapshots(VSS_ID SnapshotSetId)
> >      if (WaitForSingleObject(hEventThaw, VSS_TIMEOUT_MSEC) != WAIT_OBJECT_0) {
> >          /* Send event to qemu-ga to notify the provider is timed out */
> >          SetEvent(hEventTimeout);
> > -        hr = E_ABORT;
> >      }
> >
> >      CloseHandle(hEventThaw);
> > --
> > 2.9.3
> >
>
>


-- 
Respectfully,
*Sameeh Jubran*
*Linkedin <https://il.linkedin.com/pub/sameeh-jubran/87/747/a8a>*
*Software Engineer @ Daynix <http://www.daynix.com>.*

