[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->loc
From: |
Michael Weiser |
Subject: |
Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock |
Date: |
Fri, 25 Oct 2019 12:35:10 +0200 |
User-agent: |
Mutt/1.12.2 (2019-09-21) |
Hi Kevin,
On Wed, Oct 23, 2019 at 05:37:49PM +0200, Kevin Wolf wrote:
> > qcow2_cache_do_get() requires that s->lock is locked because it can
> > yield between picking a cache entry and actually taking ownership of it
> > by setting offset and increasing the reference count.
> >
> > Add an assertion to make sure the caller really holds the lock. The
> > function can be called outside of coroutine context, where bdrv_pread
> > and flushes become synchronous operations. The lock cannot and need not
> > be taken in this case.
> I'm still running tests to see if any other code paths trigger the
> assertion, but image creation calls this without the lock held (which is
> harmless because nobody else knows about the image so there won't be
> concurrent requests). The following patch is needed additionally to make
> image creation work with the new assertion.
I can confirm that with all four patches corruption does no longer
occur as of commit 69f47505ee66afaa513305de0c1895a224e52c45. Removing
only 3/3 (qcow2: Fix corruption bug in
qcow2_detect_metadata_preallocation()) the assertion triggers after a
few seconds, leaving behind a few leaked clusters but no errors in the
image.
(qemu) qemu-system-x86_64:qemu/include/qemu/coroutine.h:175:
qemu_co_mutex_assert_locked: Assertion `mutex->locked && mutex->holder
== qemu_coroutine_self()' failed.
Aborted (core dumped)
$ qemu-img check qtest.qcow2
Leaked cluster 169257 refcount=3 reference=2
Leaked cluster 172001 refcount=1 reference=0
Leaked cluster 172002 refcount=1 reference=0
Leaked cluster 172003 refcount=1 reference=0
Leaked cluster 172004 refcount=1 reference=0
Leaked cluster 172005 refcount=1 reference=0
Leaked cluster 172006 refcount=1 reference=0
Leaked cluster 172007 refcount=1 reference=0
Leaked cluster 172008 refcount=1 reference=0
Leaked cluster 172009 refcount=1 reference=0
Leaked cluster 172010 refcount=1 reference=0
Leaked cluster 172011 refcount=1 reference=0
Leaked cluster 172012 refcount=1 reference=0
13 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
255525/327680 = 77.98% allocated, 3.22% fragmented, 0.00% compressed
clusters
Image end offset: 17106403328
I was going to test with master as well but got overtaken by v2. Will
move on to test v2 now. :)
Series:
Tested-by: Michael Weiser <address@hidden>
No biggie but if there's a chance could you switch my address to the
above?
--
Thanks,
Michael
- [PATCH 0/3] qcow2: Fix image corruption bug in 4.1, Kevin Wolf, 2019/10/23
- [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock, Kevin Wolf, 2019/10/23
- Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock, Denis Lunev, 2019/10/24
- Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock, Vladimir Sementsov-Ogievskiy, 2019/10/24
- Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock, Kevin Wolf, 2019/10/24
[PATCH 3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation(), Kevin Wolf, 2019/10/23