Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in
From: Laszlo Ersek
Subject: Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code
Date: Tue, 22 Jan 2013 14:17:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121116 Thunderbird/10.0.11
On 01/22/13 11:25, Kevin Wolf wrote:
> Am 18.12.2012 10:46, schrieb Philipp Hahn:
>> Hello Kevin, hello Michael,
>>
>> On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
>>> Am 12.12.2012 15:09, schrieb Philipp Hahn:
>>>> Am Mittwoch 12 Dezember 2012 14:41:49 schrieb Kevin Wolf:
>>>>> As you can see in the commit message of that patch I was convinced that
>>>>> no bug did exist in practice and this was only dangerous with respect to
>>>>> future changes. Therefore my first question is if you're using an
>>>>> unmodified upstream qemu or if some backported patches are applied to
>>>>> it? If it's indeed unmodified, we should probably review the code once
>>>>> again to understand why it makes a difference.
>>>>
>>>> These were all unmodified versions directly from git between
>>>> "qemu-kvm-1.1.0" and "qemu-kvm-1.2.0"
>>>>
>>>> "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c" works,
>>>> "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1" is broken.
>>>> "git checkout qemu-kvm-1.1.2" is broken,
>>>> "git checkout qemu-kvm-1.1.2 ; git cherry-pick
>>>> b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c" works
>>>
>>> Ok, thanks for clarifying. Then I must have missed some interesting case
>>> while doing the patch.
>>
>> I think I found your missing link:
>> After filling in "QCowL2Meta *m", the request is queued:
>> QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
>> to prevent allocating the same cluster twice for overlapping requests,
>> which is checked in do_alloc_cluster_offset().
>> I guess that since the sector count was wrong, the overlap detection
>> didn't work and the two concurrent write requests to the same cluster
>> overwrote each other.
>
> I'm still not seeing it. The overlap detection code doesn't use
> m->nb_available at all, so why would it make a difference?
>
> I guess I need some people who can decently review code - Laszlo, Paolo,
> any idea why upstream commit b7ab0fea does change the behaviour,
> contrary to what I claimed in the commit message?
As I understand it, the question is not whether b7ab0fea works or not
(it does), the question is whether *without* b7ab0fea there is a real
problem (user visible bug).
Commit b7ab0fea decreased "avail_sectors" by
keep_clusters << (s->cluster_bits - BDRV_SECTOR_BITS)
which should make a difference for "m->nb_available" in two cases:
(a) avail_sectors_patched < requested_sectors <= avail_sectors_unpatched
(ie. the decrease in "avail_sectors" flipped MIN() to prefer it over
requested_sectors, lowering m->nb_available)
(b) avail_sectors_patched < avail_sectors_unpatched <= requested_sectors
(ie. MIN() had already preferred "avail_sectors", and now it went lower).
It seems that
1 << (s->cluster_bits - BDRV_SECTOR_BITS) == s->cluster_sectors [1]
Therefore both "avail_sectors_patched" and "avail_sectors_unpatched" are
an integral multiple of "s->cluster_sectors". Whenever MIN() selects
either of them, the condition in qcow2_alloc_cluster_link_l2() will
evaluate to false.
In case (b) there seems to be no difference between patched & unpatched
in this regard: MIN() always selects an integral multiple, and so
copy_sectors() is never invoked.
In case (a), the patched version skips copy_sectors(), while the
unpatched version invokes it.
I'm not sure if case (a) is possible at all -- let's expand it:
avail_sectors_patched < requested_sectors <= avail_sectors_unpatched
Substitute the assigned values:
nb_clusters << (s->cluster_bits - BDRV_SECTOR_BITS) <
n_end - keep_clusters * s->cluster_sectors <=
(keep_clusters + nb_clusters) << (s->cluster_bits - BDRV_SECTOR_BITS)
Using [1], replace the shifts with multiplication:
nb_clusters * s->cluster_sectors <
n_end - keep_clusters * s->cluster_sectors <=
(keep_clusters + nb_clusters) * s->cluster_sectors
Distribute the addition:
nb_clusters * s->cluster_sectors <
n_end - keep_clusters * s->cluster_sectors <=
keep_clusters * s->cluster_sectors +
nb_clusters * s->cluster_sectors
N := nb_clusters * s->cluster_sectors [sectors]
K := keep_clusters * s->cluster_sectors [sectors]
N < n_end - K <= K + N
Add K:
K + N < n_end <= 2*K + N [2]
I have no clue about qcow2, but *each one* of K and N seems to be a
function of *both* n_end and the current "qcow2 disk state" (ie. sectors
being allocated vs. just referenced). IOW, I suspect that case (a) is
possible for some (disk state, n_end) pairs. And case (a) -- if indeed
possible -- seems to contradict the commit message:
allocation. A COW occurs only if the request wasn't cluster aligned,
^^^^^^^ ^^^^^^^^^^^^^^^^^^^
which in turn would imply that requested_sectors was less than
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
avail_sectors (both in the original and in the fixed version). In this
^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^
In case (a) -- expanded as [2] --, "requested_sectors" is greater than
"avail_sectors_patched", and less than or equal to
"avail_sectors_unpatched".
Laszlo