Re: [Qemu-devel] [PULL 10/40] migration: Delay start of migration main r
From: Kevin Wolf
Subject: Re: [Qemu-devel] [PULL 10/40] migration: Delay start of migration main routines
Date: Tue, 22 May 2018 18:20:37 +0200
User-agent: Mutt/1.9.1 (2017-09-22)

On 18.05.2018 at 14:14, Kevin Wolf wrote:
> On 18.05.2018 at 12:34, Dr. David Alan Gilbert wrote:
> > * Kevin Wolf (address@hidden) wrote:
> > > On 16.05.2018 at 01:39, Juan Quintela wrote:
> > > > We need to make sure that we have started all the multifd threads.
> > > >
> > > > Signed-off-by: Juan Quintela <address@hidden>
> > > > Reviewed-by: Daniel P. Berrangé <address@hidden>
> > >
> > > This commit makes qemu-iotests 091 hang for me. Either it breaks
> > > backward compatibility intentionally and we need to update the test
> > > case, or there is a bug somewhere.
> >
> > It's not an intentional break.
> > And the avocado tcp and exec migrations pass OK, so hmm.
>
> In case it helps, 169 fails as well and I got a core dump of an aborting
> QEMU process:
>
> (gdb) bt
> #0 0x00007ff079f779fb in raise () at /lib64/libc.so.6
> #1 0x00007ff079f79800 in abort () at /lib64/libc.so.6
> #2 0x00007ff079f700da in __assert_fail_base () at /lib64/libc.so.6
> #3 0x00007ff079f70152 in () at /lib64/libc.so.6
> #4 0x000055c2126f067b in bdrv_close_all () at block.c:3375
> #5 0x000055c2123c54a6 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:4682
>
> If I understand correctly, that assertion failure means that someone is
> still holding a reference to a block device after all user-owned
> references have been closed. I suppose this was the source qemu and
> the migration hasn't been completed properly, though I haven't looked at
> the code yet and this idea might be completely wrong.
>
> Anyway, 091 is certainly the simpler test case to play with, but maybe
> this gives you another hint.

Any news on this? This is starting to become really annoying, as a
hanging test suite impacts my ability to properly test block layer
patches.

If there is no hope of quickly getting a proper fix for this, we may
have to revert something for now to fight the symptoms at least.

Kevin