qemu-devel
Re: [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes


From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes
Date: Tue, 27 Oct 2015 22:05:55 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 10/27/2015 09:41 PM, Paolo Bonzini wrote:
On 27/10/2015 15:09, Denis V. Lunev wrote:
The following test
     while /bin/true ; do
         virsh snapshot-create rhel7
         sleep 10
         virsh snapshot-delete rhel7 --current
     done
with iothreads enabled on a running VM leads to a lot of trouble: hangs,
assertion failures, errors.

In general, though, the HMP snapshot code is terrible. I think it should be
dropped at once and replaced with the blkdev transactions code, though that
would not fit into QEMU 2.5/stable at all.

Anyway, I think that a construct like
     assert(aio_context_is_locked(aio_context));
should be widely used to ensure proper locking.

Changes from v1:
- aio-context locking added
- comment is rewritten

Signed-off-by: Denis V. Lunev <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Paolo Bonzini <address@hidden>

For patches 4-5:

Reviewed-by: Paolo Bonzini <address@hidden>

For patches 1-3 I'm not sure, because we will remove RFifoLock
relatively soon and regular pthread recursive mutexes do not have an
equivalent of rfifolock_is_locked.

Paolo

This does not break anything in the future.

Yes, RFifoLock will go away, but aio_context_is_locked will
survive, just as similar checks live on in the kernel code. We can either
start with plain pthread_mutex_trylock/unlock, or we can
have additional stubs for Linux with checks like this

(gdb) p *(pthread_mutex_t*)0x6015a0
$3 = {
  __data = {
    __lock = 2,
    __count = 0,
    __owner = 12276,   <== LWP 12276 is Thread 3
    __nusers = 1,
    __kind = 0,        <== non-recursive
    __spins = 0,
    __list = {
      __prev = 0x0,
      __next = 0x0
    }
  },
  __size = "\002\000\000\000\000\000\000\000\364/\000\000\001", '\000' <repeats 26 times>,
  __align = 2
}

in debug mode. Yes, they rely on the internal representation,
but they are useful.
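
A minimal sketch of the plain trylock variant, assuming a non-recursive mutex; demo_mutex_is_locked is a hypothetical name, not an existing QEMU or pthread function. It only reports that somebody holds the mutex, not who, and it would not work for a recursive mutex held by the caller, because there trylock succeeds and bumps the lock count.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Debug-only probe for a non-recursive mutex (hypothetical helper).
 * pthread_mutex_trylock() returns EBUSY when the mutex is already
 * locked by any thread, including the caller. If the probe itself
 * obtains the mutex, it was free, so release it again and say so.
 */
static bool demo_mutex_is_locked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);

    if (rc == 0) {
        pthread_mutex_unlock(m);
        return false;
    }
    return rc == EBUSY;
}
```

The probe takes and drops the mutex for a moment when it is free, so it is better kept inside assert() or a debug build than on a hot path.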

This assert was VERY useful for me. I presume that there are
a LOT of similar places in the code, in different functions,
where the aio_context lock is not acquired and there has been no
way to ensure consistency.

Den


