qemu-devel

Re: Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon


From: zhenwei pi
Subject: Re: Re: [PATCH 0/3] recover hardware corrupted page by virtio balloon
Date: Fri, 27 May 2022 14:32:52 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1

On 5/27/22 02:37, Peter Xu wrote:
> On Wed, May 25, 2022 at 01:16:34PM -0700, Jue Wang wrote:
>> The hypervisor _must_ emulate poisons identified in guest physical
>> address space (could be transported from the source VM), this is to
>> prevent silent data corruption in the guest. With a paravirtual
>> approach like this patch series, the hypervisor can clear some of the
>> poisoned HVAs knowing for certain that the guest OS has isolated the
>> poisoned page. I wonder how much value it provides to the guest if the
>> guest and workload are _not_ in a pressing need for the extra KB/MB
>> worth of memory.
>
> I'm curious the same on how unpoisoning could help here.  The reasoning
> behind would be great material to be mentioned in the next cover letter.
>
> Shouldn't we consider migrating serious workloads off the host already
> where there's a sign of more severe hardware issues, instead?
>
> Thanks,


I maintain 1,000,000+ virtual machines. From my experience:
UEs (uncorrectable errors) are quite unusual and occur randomly, and I have not hit a UE storm in the past years. Memory also shows no obvious performance drop after hitting a UE.

I have hit several CE (correctable error) storms, and in those cases memory performance drops a lot. But I can't find an obvious relationship between UEs and CEs.

So from my point of view, fixing the corrupted page for the VM seems good enough. And yes, unpoisoning several pages does not help significantly, but it is still a chance to make virtualization better.

--
zhenwei pi


