qemu-devel

Re: Network connection with COLO VM


From: Dr. David Alan Gilbert
Subject: Re: Network connection with COLO VM
Date: Mon, 2 Dec 2019 09:58:06 +0000
User-agent: Mutt/1.12.1 (2019-06-15)

* Daniel Cho (address@hidden) wrote:
> Hi Zhang,
> 
> We are using the qemu-4.1.0 release in this case.
> 
> I think we need to use block mirror to sync the disk to the secondary node
> first, then stop the primary VM and build the COLO system.
> 
> While the VM is stopped, you need to add the netfilter and chardev socket
> nodes for COLO; maybe you need to re-check this part.
> 
> 
> Our test already follows those steps. Let me describe the test flow and
> the issue in detail.
> 
> 
> Step 1:
> 
> Create the primary VM without any netfilter or chardev for COLO, and ping
> the primary VM continuously from another host.
> 
> 
> Step 2:
> 
> Create the secondary VM (with the same devices/drives as the primary VM)
> and run the block mirror sync ( ping to the primary VM works normally )
> 
> 
> Step 3:
> 
> After the block mirror sync finishes, add the netfilter and chardev
> objects to the primary VM and secondary VM for COLO ( *Can't* ping the
> primary VM, but those packets are received later )
> 
> 
> Step 4:
> 
> Start migrating the primary VM to the secondary VM; both the primary VM
> and the secondary VM are then running ( ping to the primary VM works, and
> the packets sent during step 3 are received )
> 
> 
> 
> 
> Between Step 3 and Step 4 it takes 10~20 seconds in our environment.
> 
> I can imagine that this issue (delayed reply packets) is caused by the
> COLO proxy being set up while in a temporary state,
> 
> but we think 10~20 seconds is a little long. (If the primary VM is already
> doing some work, it might lose data.)
> 
> 
> Could we reduce this time? Or does the delay depend on the VM?

I think you need to set up the netfilter and chardev on the primary at
the start;  the filter contains the state of the TCP connections working
with the VM, so adding it later can't gain that state for existing
connections.

Dave
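
To illustrate the point about connection state, here is a toy Python model. It is not QEMU code and does not reflect how the COLO filters are actually implemented; the `Packet` and `ToyFilter` names, the connection key, and the "learn on SYN" rule are all assumptions made for the sketch. It only shows why a stateful filter attached after a connection has started has nothing to match that connection's later packets against.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    # Hypothetical connection key: (src_ip, src_port, dst_ip, dst_port).
    conn: tuple
    # True only for the packet that opens the connection.
    syn: bool = False


@dataclass
class ToyFilter:
    """Toy stand-in for a stateful packet filter (assumption, not QEMU)."""
    known: set = field(default_factory=set)

    def handle(self, pkt: Packet) -> str:
        # State for a connection is created only when its SYN is seen.
        if pkt.syn:
            self.known.add(pkt.conn)
            return "tracked"
        # Later packets can only be matched against existing state.
        return "tracked" if pkt.conn in self.known else "unknown"


conn = ("10.0.0.2", 40000, "10.0.0.1", 22)
stream = [Packet(conn, syn=True), Packet(conn), Packet(conn)]

# Filter present from the start: it sees the SYN and tracks the connection.
early = ToyFilter()
early_result = [early.handle(p) for p in stream]

# Filter attached after the SYN has already passed: no state to recover.
late = ToyFilter()
late_result = [late.handle(p) for p in stream[1:]]

print(early_result)
print(late_result)
```

In this model the late filter never learns about the already-open connection, which mirrors the reasoning for configuring the netfilter and chardev on the primary from the start.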

> 
> Best regards,
> 
> Daniel Cho.
> 
> 
> 
> Zhang, Chen <address@hidden> wrote on Sat, Nov 30, 2019 at 2:04 AM:
> 
> >
> >
> >
> >
> > *From:* Daniel Cho <address@hidden>
> > *Sent:* Friday, November 29, 2019 10:43 AM
> > *To:* Zhang, Chen <address@hidden>
> > *Cc:* Dr. David Alan Gilbert <address@hidden>; address@hidden;
> > address@hidden
> > *Subject:* Re: Network connection with COLO VM
> >
> >
> >
> > Hi David,  Zhang,
> >
> >
> >
> > Thanks for replying to my question.
> >
> > We now know why this issue occurs. As you said, the COLO VM's network
> > needs colo-proxy to control packets, so the filter should be set on the
> > guest's interface to solve the problem.
> >
> >
> >
> > But we found another problem: when we enable the fault-tolerance
> > feature for the guest (primary VM running, secondary VM paused), the
> > guest's network does not respond to any request for a while (about
> > 20~30 seconds in our environment) after the secondary VM starts running.
> >
> >
> >
> > Is this normal behaviour, or a known issue?
> >
> >
> >
> > In our test, the primary VM is created and runs for a while; then the
> > secondary VM is created to enable the COLO feature.
> >
> >
> >
> > Hi Daniel,
> >
> >
> >
> > Happy to hear you have solved the ssh disconnection issue.
> >
> >
> >
> > Are you using Lukas’s patch in this case?
> >
> > I think we need to use block mirror to sync the disk to the secondary
> > node first, then stop the primary VM and build the COLO system.
> >
> > While the VM is stopped, you need to add the netfilter and chardev socket
> > nodes for COLO; maybe you need to re-check this part.
> >
> >
> >
> > Best regards,
> >
> > Daniel Cho
> >
> >
> >
> > Zhang, Chen <address@hidden> wrote on Thu, Nov 28, 2019 at 9:26 AM:
> >
> >
> >
> > > -----Original Message-----
> > > From: Dr. David Alan Gilbert <address@hidden>
> > > Sent: Wednesday, November 27, 2019 6:51 PM
> > > To: Daniel Cho <address@hidden>; Zhang, Chen
> > > <address@hidden>; address@hidden
> > > Cc: address@hidden
> > > Subject: Re: Network connection with COLO VM
> > >
> > > * Daniel Cho (address@hidden) wrote:
> > > > Hello everyone,
> > > >
> > > > > Can we ssh to a COLO VM (meaning the PVM & SVM are both running)?
> > > >
> > >
> > > > Let's cc in Zhang Chen and Lukas Straub.
> >
> > Thanks Dave.
> >
> > >
> > > > > SSH stays connected to the COLO VM for a while, but then it
> > > > > disconnects with the error
> > > > > *client_loop: send disconnect: Broken pipe*
> > > > >
> > > > > It seems the COLO VM cannot keep a network session alive.
> > > > >
> > > > > Is this a known issue?
> > >
> > > > That sounds like the COLO proxy is getting upset; it's supposed to compare
> > > > packets sent by the primary and secondary and only send one to the outside
> > > > - you shouldn't be talking directly to the guest, but always via the proxy.
> > > > See docs/colo-proxy.txt
> > >
> >
> > Hi Daniel,
> >
> > I have tried ssh to a COLO guest for 8 hours and this issue did not occur.
> > Please check your network/qemu configuration.
> > But I found another problem that may be related to this issue: if there is
> > no network communication for a period of time (maybe 10 min), the first
> > message sent to the guest may be delayed (maybe 1-5 sec). I will try to
> > fix it when I have time.
> >
> > Thanks
> > Zhang Chen
> >
> > > Dave
> > >
> > > > > Best regards,
> > > > Daniel Cho
> > > --
> > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> >
> >
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



