From: Halil Pasic
Subject: Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
Date: Wed, 30 May 2018 17:15:19 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 05/30/2018 06:47 AM, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 12:44:53PM +0800, Peter Xu wrote:
>> There are many error_report()s that can be used in frequently called functions, especially on IO paths. That can be unideal in that malicious guest can try to trigger the error tons of time which might use up the log space on the host (e.g., libvirt can capture the stderr of QEMU and put it persistently onto disk).
>
> I think the problem is real enough but I think the API isn't great as it stresses the mechanism. Which fundamentally does not matter - we can print once or 10 times, or whatever. What happens here is a guest bug as opposed to hypervisor bug. So I think a better name would be guest_error.

I don't agree with your argument against the name report_once, Michael. In my reading, the commit message describes one of the use cases for which the infrastructure introduced by this patch is supposed to be a good fit. But report_once is not restricted to this example. In my previous life in userspace I had to debug problems where the original error message got log-rotated away because of an onslaught of error messages that were a consequence of the original one, and not very helpful.

IMHO raising the issue of guest_error is a very sane thing to do, but it is a different problem. I think guest_error is about how and to whom the error is to be reported. IMHO reporting the error to those who are affected by it and to those who can do something about it (e.g. fix it) is a good rule of thumb. The latter may be different for hypervisor and for guest bugs.

In my understanding this is really about the problem of spamming the log. Of course one can try to solve/mitigate the problem at different levels. It could be declared

1) a problem to be solved in the logging library more or less transparently
2) a problem to be solved by the environment and its admin (e.g. log aggregation, filtering, and rotation)
3) a problem that the client code of the logging library has to explicitly deal with

The once and rate_limited variants are 3).

To sum it up, guest error or not and once or not are orthogonal problems in my view.

Regards,
Halil

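To make option 3) concrete, here is a minimal sketch of what a report-once helper could look like from the client code's point of view. It is not the code from the patch under review: `demo_error_report`, `demo_error_report_once`, `demo_handle_bad_request`, and the `reports_emitted` counter are all hypothetical names invented for this illustration, and `demo_error_report` is only a stand-in for QEMU's error_report().

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for error_report(); counts how many messages got out. */
static int reports_emitted;

static void demo_error_report(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    reports_emitted++;
}

/*
 * Report at most once per call site: every expansion of the macro gets its
 * own static flag. The flag is not protected against concurrent callers,
 * which is an assumption this sketch accepts for simplicity.
 */
#define demo_error_report_once(...)             \
    do {                                        \
        static bool reported_;                  \
        if (!reported_) {                       \
            reported_ = true;                   \
            demo_error_report(__VA_ARGS__);     \
        }                                       \
    } while (0)

/* Hypothetical hot path that a guest could trigger as often as it likes. */
static void demo_handle_bad_request(int queue)
{
    demo_error_report_once("invalid request on queue %d", queue);
}
```

Calling `demo_handle_bad_request()` in a loop puts a single line into the log, so the first message cannot be log-rotated away by its own repeats. The caller has to opt in explicitly at each call site, which is what distinguishes 3) from a transparent solution inside the logging library.
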
> Internally we can still have something similar to this mechanism. Another idea is to reset these guest error counters on guest reset. Device reset too? I'm not 100% sure as guest can trigger device resets.
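
The idea of resetting guest error counters on guest reset can be sketched as follows. This is only an illustration under assumptions: `DemoDevice`, `demo_guest_error`, and `demo_guest_reset` are hypothetical names and do not correspond to any existing QEMU type or function. The point is that the counter lives in per-device state instead of a static flag, so a reset hook can clear it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-device state; not a QEMU type. */
typedef struct DemoDevice {
    unsigned guest_errors_reported; /* messages printed since the last reset */
    unsigned guest_errors_limit;    /* message budget per reset cycle */
} DemoDevice;

/* Print a guest error unless the budget is used up; returns true if printed. */
static bool demo_guest_error(DemoDevice *dev, const char *msg)
{
    if (dev->guest_errors_reported >= dev->guest_errors_limit) {
        return false;
    }
    dev->guest_errors_reported++;
    fprintf(stderr, "guest error: %s\n", msg);
    return true;
}

/* Hypothetical reset hook: a fresh guest gets a fresh message budget. */
static void demo_guest_reset(DemoDevice *dev)
{
    dev->guest_errors_reported = 0;
}
```

With a limit of 1 this behaves like report-once per reset cycle. It also makes the concern about device resets visible: if the guest itself can trigger whatever calls `demo_guest_reset()`, it can refill the budget at will and spam the log again.
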
[..]