[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH-for-4.1 v6 0/5] hw/block/pflash_cfi01: Add Devic

From: Philippe Mathieu-Daudé
Subject: Re: [Qemu-devel] [PATCH-for-4.1 v6 0/5] hw/block/pflash_cfi01: Add DeviceReset() handler
Date: Wed, 17 Jul 2019 16:00:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

Hi Laszlo,

On 7/17/19 2:24 PM, Laszlo Ersek wrote:
> Hi Phil,
> On 07/17/19 00:15, Philippe Mathieu-Daudé wrote:
>> Hello it's me again, insisting with this series because there are at
>> least 2 different report of guests bricked on reset due to the bug
>> fixed by patch #5:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1678713
>> https://bugzilla.redhat.com/show_bug.cgi?id=1704584
>> Patches missing review: 2 and 3
>> The pflash device lacks a reset() function.
>> When a machine is resetted, the flash might be in an
>> inconsistent state, leading to unexpected behavior.
>> Resolve this issue by adding a DeviceReset() handler.
>> Since v1: https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg00962.html
>> - addressed Laszlo review comments
>> Since v2: https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg00395.html
>> - consider migration (Laszlo, Peter)
>> Since v3: https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg01668.html
>> - more reliable migration (Dave)
>> - dropped patches 6-9 not required for next release
>> Since v4: https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg02785.html
>> - document why using READ_ARRAY value 0x00 for migration is safe
>> Since v5: https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03366.html
>> - avoid trying to be spec-compliant and messing with migration. KISS.
>>   review/test tags reset, sorry.
> I have a number of questions / requests.
> (1) The last I recall from the v5 discussion is Markus asking about some
> risky cases (corner cases?) related to migration.
> So what is the new avenue taken in v6? What does "continue ignoring spec
> compliance" mean, with regard to migration?
> My vague understanding is that we're not trying to answer Markus's
> questions; instead, we're side-stepping them, by doing something else.

I'd love to keep the v5 series and address Markus issues, but as shown
by the quantity of respins, I don't understand the migration feature
enough to fix it in time for the next release. I plan to address them
(and other issues reported by Markus in other reviews) during the next
dev cycle.

> That works for me, but can we please summarize in a bit more detail?
> Like, "in v6, we're not mapping 0x00 vs. 0xff across migration because..."

Since it is very late in the release process and this series intend to
fix a guest corruption bug worthwhile for release, the approch taken by
v6 is "try to change the strict minimum regarding to migration, do not
worry about spec issues". I even tried to make no migration change at
all, but as explained in patch 6 "Extract pflash_mode_read_array" there
is still one.

I could make no migration change, and in patch 6 not call
memory_region_rom_device_set_romd() in pflash_mode_read_array() [and
call it in other places instead], but then we still have the undefined
behavior described in the patch. We always lived with this UNDEF, so...
I could work on a simpler v7.

> Yes, I could inter-diff v5 vs. v6, but I know way too little about
> pflash. I'd miss how our *dropping* of the special 0xff mapping impacts
> our conformance to the data sheet.

To reassure you, it never worked well but nobody cared, I noticed while
converting DPRINTF to trace-events and adding more, then follow the
model state machine. While it's probably not worth fixing, it makes
debugging slighly harder when looking at the CFI spec workflow. Now I'm
used to it.

> I'm not asking for commit message updates, just a bit more explanation
> in this free-form discussion. (I looked for it under v5, and couldn't
> find it.)

I tried to be verbose in the patch description, so for reference:

0xff on v5:

not 0xff on v6:

migration impact on v6 [1]:

> (2) Has someone regression-tested v6 for migration specifically?

No, neither were tested the previous versions.

> Or, is v6 not "risky" wrt. migration any longer?

v6 should be way less risky than previous versions (still one issue
noted in [1]).

> (3) I'm fine regression testing v6 too (without migration, again).
> Please ping me separately once the reviews have converged and the series
> is otherwise ready for merging.

Yes, I know your testing takes very long, so let's wait first.

Thanks for having a look.


> Thanks!
> Laszlo
>> $ git backport-diff -u v5
>> Key:
>> [----] : patches are identical
>> [####] : number of functional differences between upstream/downstream patch
>> [down] : patch is downstream-only
>> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
>> respectively
>> 001/5:[----] [-C] 'hw/block/pflash_cfi01: Removed an unused timer'
>> 002/5:[down] 'hw/block/pflash_cfi01: Document use of non-CFI compliant 
>> command '0x00''
>> 003/5:[0031] [FC] 'hw/block/pflash_cfi01: Extract pflash_mode_read_array()'
>> 004/5:[down] 'hw/block/pflash_cfi01: Rename 'reset_flash' label as 
>> 'mode_read_array''
>> 005/5:[----] [--] 'hw/block/pflash_cfi01: Add the DeviceReset() handler'
>> Regards,
>> Phil.
>> Philippe Mathieu-Daudé (5):
>>   hw/block/pflash_cfi01: Removed an unused timer
>>   hw/block/pflash_cfi01: Document use of non-CFI compliant command
>>     '0x00'
>>   hw/block/pflash_cfi01: Extract pflash_mode_read_array()
>>   hw/block/pflash_cfi01: Rename 'reset_flash' label as 'mode_read_array'
>>   hw/block/pflash_cfi01: Add the DeviceReset() handler
>>  hw/block/pflash_cfi01.c | 77 +++++++++++++++++++++--------------------
>>  hw/block/trace-events   |  1 +
>>  2 files changed, 41 insertions(+), 37 deletions(-)

reply via email to

[Prev in Thread] Current Thread [Next in Thread]