qemu-arm

Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15


From: Markus Armbruster
Subject: Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Date: Wed, 10 Jan 2024 14:54:53 +0100
User-agent: Gnus/5.13 (Gnus v5.13)

Fabiano Rosas <farosas@suse.de> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> Peter Xu <peterx@redhat.com> writes:
>>
>>> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
>>>> Hi Fabiano,
>>>> 
>>>> On 9/1/24 21:21, Fabiano Rosas wrote:
>>>> > Cédric Le Goater <clg@kaod.org> writes:
>>>> > 
>>>> > > On 1/9/24 18:40, Fabiano Rosas wrote:
>>>> > > > Cédric Le Goater <clg@kaod.org> writes:
>>>> > > > 
>>>> > > > > On 1/3/24 20:53, Fabiano Rosas wrote:
>>>> > > > > > Philippe Mathieu-Daudé <philmd@linaro.org> writes:
>>>> > > > > > 
>>>> > > > > > > +Peter/Fabiano
>>>> > > > > > > 
>>>> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote:
>>>> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
>>>> > > > > > > > > Hi Cédric,
>>>> > > > > > > > > 
>>>> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote:
>>>> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
>>>> > > > > > > > > > > Hi,
>>>> > > > > > > > > > > 
>>>> > > > > > > > > > > When an MPCore cluster is used, the Cortex-A cores belong to the
>>>> > > > > > > > > > > cluster container, not to the board/soc layer. This series moves
>>>> > > > > > > > > > > the creation of vCPUs to the MPCore private container.
>>>> > > > > > > > > > > 
>>>> > > > > > > > > > > Doing so we consolidate the QOM model, moving common code into a
>>>> > > > > > > > > > > central place (abstract MPCore parent).
>>>> > > > > > > > > > 
>>>> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of the machine
>>>> > > > > > > > > > and some fixups are then required to maintain migration compatibility.
>>>> > > > > > > > > > This can become a real headache for KVM machines like virt for which
>>>> > > > > > > > > > migration compatibility is a feature, less for emulated ones.
>>>> > > > > > > > > 
>>>> > > > > > > > > All changes are either moving properties (which are not migrated)
>>>> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
>>>> > > > > > > > > is still migrated elsewhere). So I don't see any obvious migration
>>>> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :>
>>>> > > > > > 
>>>> > > > > > FWIW, I didn't spot anything problematic either.
>>>> > > > > > 
>>>> > > > > > I've run this through my migration compatibility series [1] and it
>>>> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M
>>>> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
>>>> > > > > > think we even support migration of anything non-KVM on arm.
>>>> > > > > 
>>>> > > > > it happens we do.
>>>> > > > > 
>>>> > > > 
>>>> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something like
>>>> > > > non-KVM-capable cpus, as in 32-bit. Never mind.
>>>> > > 
>>>> > > Theoretically, we should be able to migrate to a TCG guest. Well, this
>>>> > > worked in the past for PPC. When I was doing more KVM related changes,
>>>> > > this was very useful for dev. Also, some machines are partially emulated.
>>>> > > Anyhow I agree this is not a strong requirement and we often break it.
>>>> > > Let's focus on KVM only.
>>>> > > 
>>>> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
>>>> > > > > 
>>>> > > > > yes it depends on the QOM hierarchy and virt seems immune to the changes.
>>>> > > > > Good.
>>>> > > > > 
>>>> > > > > However, changing the QOM topology clearly breaks migration compat,
>>>> > > > 
>>>> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed
>>>> > > > already, do you have a pointer to one of those cases where we broke
>>>> > > > migration?
>>>> > > 
>>>> > > Regarding pseries, migration compat broke because of 5bc8d26de20c
>>>> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
>>>> > > is similar to the changes proposed by this series, it impacts the QOM
>>>> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
>>>> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which
>>>> > > is quite a headache and this turned out to raise another problem some
>>>> > > months ago ... :/ That's why I sent [1] to prepare removal of old
>>>> > > machines and workarounds becoming a burden.
>>>> > 
>>>> > This feels like something that could be handled by the vmstate code
>>>> > somehow. The state is there, just under a different path.
>>>> 
>>>> What, the QOM path is used in migration? ...
>>>
>>> Hopefully not..
>
> Unfortunately the original fix doesn't mention _what_ actually broke
> with migration. I assumed the QOM path was needed because otherwise I
> don't think the fix makes sense. The thread discussing that patch also
> directly mentions the QOM path:
>
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html
>
> But I probably misunderstood something while reading that thread.
>
>>>
>>>> 
>>>> See recent discussions on "QOM path stability":
>>>> https://lore.kernel.org/qemu-devel/ZZfYvlmcxBCiaeWE@redhat.com/
>>>> https://lore.kernel.org/qemu-devel/87jzojbxt7.fsf@pond.sub.org/
>>>> https://lore.kernel.org/qemu-devel/87v883by34.fsf@pond.sub.org/
>>>
>>> If I read it right, the commit 46f7afa37096 example is pretty special in that
>>> the QOM path more or less decided more than the hierarchy itself but changes
>>> the existence of objects.
>>
>> Let's see whether I got this...
>>
>> We removed some useless objects, moved the useful ones to another home.
>> The move changed their QOM path.
>>
>> The problem was the removal of useless objects, because this also
>> removed their vmstate.
>
> If you check out the removal commit (5bc8d26de20c), the vmstate has
> been kept untouched.
>
>>
>> The fix was adding the vmstate back as a dummy.
>
> Since the vmstate was kept I don't see why we would need a dummy. The
> incoming migration stream would still have the state, only at a
> different point in the stream. It's surprising to me that that would
> cause an issue, but I'm not well versed in that code.

Alright, I understand neither the problem nor the fix :)

>> The QOM path changes are *not* part of the problem.
>
> The only explanation I can come up with is that, after the patch,
> migration broke following a hotplug or similar operation. In such a
> situation, the preallocated state would always be present before the
> patch, but sometimes not present after the patch in case, say, a
> hot-unplug has taken away a cpu + ICPState.

My head hurts...  Oh, we're talking migration!  Perfectly normal then.
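
For what it's worth, here is a toy model of the hypothesis quoted above.
It is only a sketch and NOT QEMU's vmstate code: the function names, the
"icp/N" section names, and the idea that the destination insists on one
section per possible CPU slot are all assumptions made for illustration.

```python
# Toy model only: this is not how QEMU's vmstate code is written.
# All names here (make_stream, missing_sections, "icp/N") are hypothetical.

def make_stream(plugged_cpus, max_cpus, preallocated):
    """Return the ICP state section names a source would send."""
    if preallocated:
        # Hypothesized "before the patch" behavior: one state per possible slot.
        return [f"icp/{i}" for i in range(max_cpus)]
    # Hypothesized "after the patch" behavior: state only for plugged CPUs.
    return [f"icp/{i}" for i in sorted(plugged_cpus)]


def missing_sections(stream, expected):
    """Return the expected section names that are absent from the stream."""
    present = set(stream)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    max_cpus = 4
    # Assumption: the destination expects a section for every possible slot.
    expected = [f"icp/{i}" for i in range(max_cpus)]
    plugged = {0, 1, 2}  # CPU 3 has been hot-unplugged on the source

    print(missing_sections(make_stream(plugged, max_cpus, True), expected))
    print(missing_sections(make_stream(plugged, max_cpus, False), expected))
```

Under these assumptions the preallocated source satisfies the destination
regardless of hotplug state, while the other one leaves "icp/3" missing
after the unplug. Whether that matches what actually broke is exactly the
open question in this thread.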



