qemu-discuss

[Qemu-discuss] Re: Latest Qemu-COLO Problems


From: wenzt
Subject: [Qemu-discuss] Re: Latest Qemu-COLO Problems
Date: Wed, 6 Mar 2019 18:27:57 +0800

I have tested the proxy with this QMP command: "{'execute': 'trace-event-set-state',
'arguments': {'name': 'colo*', 'enable': true} }"
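
For anyone who wants to script this, here is a minimal Python sketch of sending that command to a QMP monitor. It assumes QEMU was started with a QMP Unix socket; the socket path and the helper names `build_trace_command` and `send_qmp` are hypothetical and do not come from this thread. QMP expects the `qmp_capabilities` handshake before any other command, so the sketch sends it first.

```python
import json
import socket


def build_trace_command(pattern, enable=True):
    """Serialize a trace-event-set-state command as one line of JSON."""
    cmd = {
        "execute": "trace-event-set-state",
        "arguments": {"name": pattern, "enable": enable},
    }
    return json.dumps(cmd)


def send_qmp(sock_path, commands):
    """Send JSON command lines to a QMP Unix socket.

    `sock_path` is an assumption (wherever -qmp unix:... was pointed).
    Returns the parsed return/error reply for each command.
    """
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        reader = sock.makefile("r", encoding="utf-8")
        reader.readline()  # greeting banner sent by QEMU on connect
        handshake = json.dumps({"execute": "qmp_capabilities"})
        for line in [handshake] + list(commands):
            sock.sendall(line.encode("utf-8") + b"\n")
            while True:
                raw = reader.readline()
                if not raw:
                    raise ConnectionError("QMP socket closed")
                msg = json.loads(raw)
                # Asynchronous events carry an "event" key; skip them.
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies[1:]  # drop the reply to qmp_capabilities
```

Usage would look like `send_qmp("/tmp/qmp-primary.sock", [build_trace_command("colo*")])`, where the path is again only an example.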

 

I got nothing except these logs on the PVM side:

address@hidden:colo_compare_main : secondary: unsupported packet in

address@hidden:colo_compare_main : secondary: unsupported packet in

address@hidden:colo_compare_main : secondary: unsupported packet in

address@hidden:colo_compare_main : primary: unsupported packet in

address@hidden:colo_compare_main : secondary: unsupported packet in

 

My guest OS is CentOS 7.5.

I did nothing but boot the OS.

Generating some network I/O afterwards still produces the same logs.

 

I did some debugging. There may be a parse error in parse_packet_early() that
yields the wrong ETH_P_* protocol value.
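
To illustrate how a wrong protocol value can come out of an otherwise sound parser, here is a small Python sketch of reading the EtherType from a raw frame. This is not QEMU's parse_packet_early(); the function name `ether_type` and the 10-byte prefix in the usage note are assumptions made purely for illustration. The point is only that if parsing starts at the wrong offset, bytes of a MAC address get read where the EtherType should be.

```python
import struct

ETH_HLEN = 14         # dst MAC (6) + src MAC (6) + EtherType (2)
ETH_P_IP = 0x0800     # IPv4
ETH_P_8021Q = 0x8100  # VLAN tag marker


def ether_type(frame, offset=0):
    """Return the EtherType of the Ethernet frame starting at `offset`."""
    if len(frame) < offset + ETH_HLEN:
        raise ValueError("frame too short for an Ethernet header")
    (proto,) = struct.unpack_from("!H", frame, offset + 12)
    if proto == ETH_P_8021Q:
        # A 4-byte VLAN tag sits before the real EtherType.
        if len(frame) < offset + ETH_HLEN + 4:
            raise ValueError("frame too short for a VLAN tag")
        (proto,) = struct.unpack_from("!H", frame, offset + 16)
    return proto
```

For example, with `frame = b"\xff" * 6 + b"\x52\x54\x00\x12\x34\x56" + struct.pack("!H", ETH_P_IP) + b"\x00" * 20`, `ether_type(frame)` gives `0x0800`. If ten unrelated bytes are prepended and the parser is not told about them, `ether_type(b"\x00" * 10 + frame)` reads part of the destination MAC instead, while `ether_type(b"\x00" * 10 + frame, offset=10)` recovers `0x0800`.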

 

Thanks,

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: March 5, 2019 23:32
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

 

From: wenzt [mailto:address@hidden]
Sent: Thursday, February 28, 2019 10:00 AM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: Re: Latest Qemu-COLO Problems

 

This version: https://github.com/coloft/qemu/tree/colo-v4.1-periodic-mode

 

This is an old version from three years ago. Please drop it and use the
upstream QEMU code.

 

Another question:

What is the relationship between the proxy and checkpoints?

 

When the PVM and SVM send different network packets, the proxy signals the
COLO framework to do a checkpoint.

 

Do they work together? I guess we should set a longer checkpoint interval,
such as 20 s.

 

Yes, they work together. In addition, we have a periodic checkpoint
mechanism that works like a timer. You can set its interval too.
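
The interaction described above can be sketched as a small decision function: a checkpoint is requested either when the two VMs' packets differ or when the periodic timer expires. This is only an illustration of the logic as explained in this thread, not COLO's implementation; the class name `CheckpointPolicy`, its fields, and the 20-second interval are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPolicy:
    interval_s: float            # periodic checkpoint interval (assumed name)
    last_checkpoint: float = 0.0

    def reason_to_checkpoint(self, now, primary_pkt, secondary_pkt):
        """Return why a checkpoint is needed, or None if it is not."""
        if primary_pkt != secondary_pkt:
            return "mismatch"    # proxy saw diverging output
        if now - self.last_checkpoint >= self.interval_s:
            return "timer"       # periodic checkpoint is due
        return None

    def record_checkpoint(self, now):
        """Remember when the last checkpoint completed."""
        self.last_checkpoint = now


policy = CheckpointPolicy(interval_s=20.0)
print(policy.reason_to_checkpoint(5.0, b"syn", b"syn"))   # None
print(policy.reason_to_checkpoint(5.0, b"syn", b"ack"))   # mismatch
print(policy.reason_to_checkpoint(20.0, b"syn", b"syn"))  # timer
```

With a long interval and identical packets, almost all checkpoints would come from the timer, which matches the observation that the proxy only matters when there is network traffic to compare.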

 

Does the proxy only work under a network workload? In my test, the proxy
does not seem to be working.

 

Yes. As the wiki says, colo-proxy compares the PVM and SVM packets to decide
whether to do a checkpoint.

You can enable the COLO debug output to see what the proxy is doing on the
primary node, like this:

"{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*',
'enable': true} }"

 

 

Thanks

Zhang Chen

 

 

From: Zhang, Chen <address@hidden>
Sent: February 28, 2019 9:34
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

Which version?

The COLO project has always stated that the PVM and SVM execute in parallel.

 

Thanks

Zhang Chen

 

From: wenzt [mailto:address@hidden]
Sent: Thursday, February 28, 2019 9:21 AM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: Re: Latest Qemu-COLO Problems

 

But in an earlier version, I noticed that the SVM always stays in the
incoming-migration state, even while doing a checkpoint.

No operation can be performed on the SVM.

 

Thanks, 

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: February 27, 2019 18:45
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

 

From: wenzt [mailto:address@hidden]
Sent: Wednesday, February 27, 2019 6:04 PM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: Re: Latest Qemu-COLO Problems

 

Thanks for the help!

 

Why do we keep switching the SVM between running and stopped?

Why don't we keep the SVM in the incoming-migration state?

 

Because we need to do checkpoints to synchronize all state between the PVM and SVM.

We can't guarantee that their states will still be the same after a while.

 

Thanks

Zhang Chen

 

Thanks, 

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: February 26, 2019 18:41
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

By the way, please read the COLO wiki and use these commands to trigger
failover on the secondary node:

 

{ 'execute': 'nbd-server-stop' }

{ "execute": "x-colo-lost-heartbeat" }
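
The order of the two commands matters for a script, so here is a short Python sketch that issues them in sequence and stops at the first error. The transport is deliberately left abstract: `send` is a hypothetical callable that writes one JSON line to the secondary node's QMP monitor and returns the parsed reply; it is not part of QEMU or of this thread.

```python
import json


def failover_commands():
    """Return the secondary-node failover commands, in the order given above."""
    return [
        json.dumps({"execute": "nbd-server-stop"}),
        json.dumps({"execute": "x-colo-lost-heartbeat"}),
    ]


def run_failover(send):
    """Issue each command through `send` and abort on the first error reply.

    `send` is an assumed helper: it takes one JSON command line and
    returns the decoded QMP reply as a dict.
    """
    replies = []
    for line in failover_commands():
        reply = send(line)
        if "error" in reply:
            raise RuntimeError(f"{line} failed: {reply['error']}")
        replies.append(reply)
    return replies
```

Keeping the command list separate from the transport makes it easy to dry-run the sequence with a fake `send` before pointing it at a real monitor socket.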

 

 

Thanks

Zhang Chen

 

From: Zhang, Chen
Sent: Tuesday, February 26, 2019 2:46 PM
To: 'wenzt' <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

Sorry for the slow response.

I have fixed this bug in this series:

 

https://lists.nongnu.org/archive/html/qemu-devel/2019-02/msg06920.html

 

Please test it.

 

 

Thanks

Zhang Chen

 

From: wenzt [mailto:address@hidden]
Sent: Friday, February 15, 2019 7:54 PM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: Latest Qemu-COLO Problems

 

Hi Zhang,

 

I have tested COLO with qemu-3.1.0, following
https://wiki.qemu.org/Features/COLO

 

I got these problems on the PVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 644348}, "event": "STOP"}

{"timestamp": {"seconds": 1550230616, "microseconds": 719003}, "event": "RESUME"}

{"timestamp": {"seconds": 1550230616, "microseconds": 743554}, "event": "STOP"}

qemu-system-x86_64: Can't receive COLO message: Input/output error

qemu-system-x86_64: Can't receive COLO message: Input/output error

{"timestamp": {"seconds": 1550230618, "microseconds": 257209}, "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "error"}}

 

 

And on the SVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 731544}, "event": "STOP"}

address@hidden:colo_vm_state_change Change 'run' => 'stop'

address@hidden:colo_send_message Send 'checkpoint-reply' message

address@hidden:colo_receive_message Receive 'vmstate-send' message

address@hidden:colo_flush_ram_cache_begin dirty_pages 18446744073708498780

address@hidden:colo_flush_ram_cache_end

address@hidden:colo_receive_message Receive 'vmstate-size' message

address@hidden:colo_send_message Send 'vmstate-received' message

{"timestamp": {"seconds": 1550230616, "microseconds": 837436}, "event": "RESUME"}

qemu-system-x86_64: block.c:5062: bdrv_detach_aio_context: Assertion `!bs->walking_aio_notifiers' failed.

Aborted (core dumped)
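
A side note on the SVM trace: the dirty_pages value printed by colo_flush_ram_cache_begin is very close to 2^64, which is what a small negative number looks like when it is printed as an unsigned 64-bit integer. The Python sketch below reinterprets the logged value as a signed 64-bit integer. The helper name `as_signed64` is hypothetical, and the arithmetic only suggests a counter that went below zero; it does not show where in QEMU that happens.

```python
def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


logged = 18446744073708498780  # dirty_pages from the SVM trace
print(as_signed64(logged))     # -1052836
```

Read as signed, the counter is about one million pages below zero rather than an implausibly large page count.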


