
Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter


From: Wen Congyang
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
Date: Fri, 22 Jan 2016 15:46:42 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/22/2016 03:42 PM, Jason Wang wrote:
> 
> 
> On 01/22/2016 02:47 PM, Wen Congyang wrote:
>> On 01/22/2016 02:21 PM, Jason Wang wrote:
>>>
>>> On 01/22/2016 01:56 PM, Wen Congyang wrote:
>>>> On 01/22/2016 01:41 PM, Jason Wang wrote:
>>>>>>
>>>>>> On 01/22/2016 11:28 AM, Wen Congyang wrote:
>>>>>>>> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>>>>>>>>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>>>>>>>>>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>>>>>>>>>>>>>>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>>>>>>>>>>>>>>>>>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>>>>>>>>>>>>>>>>>>>>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Two main comments/suggestions:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - TCP analysis is missing in the current version. Maybe you could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> point me to a git tree (or another version of the RFC) for a better
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> understanding of the design. (Just a skeleton for TCP should be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sufficient to discuss.)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - I prefer to make the code as reusable as possible, so it's better
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to split/decouple the reusable parts from the code. A vague idea is:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this 99% since the work has been done in a thread. Just let the thread
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> poll sockets directly, then the comparing can be reused by other
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> kinds of dataplane.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2) Implement the traffic mirror/redirector as a filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Then, in the primary node, you need just a traffic mirror, which:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - mirrors ingress traffic to the secondary node
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - mirrors egress traffic to the packet comparing thread
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And in the secondary node, you need two filters:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - A traffic redirector which redirects packets from a socket as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ingress traffic, and redirects egress traffic to the socket which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could be polled by the remote packet comparing thread.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   Thoughts?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> zhangchen
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, Jason.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> We have considered your suggestion to split/decouple
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the reusable parts from the code.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Since filter plugins are traversed one by one in order,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> we will split colo-proxy into three filters on each side.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> But in this plan, both primary and secondary have a socket
>>>>>>>>>>>>>>>>>>>>>>>>>>>> server, so startup is a problem.
>>>>>>>>>>>>>>>>>>>>>>>> I believe this issue could be solved by reusing socket 
>>>>>>>>>>>>>>>>>>>>>>>> chardev.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [ASCII diagram lost to line wrapping. Recoverable content:]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Primary qemu: guest on top of a netfilter layer; filter execute
>>>>>>>>>>>>>>>>>>>>>>>>>>>> order is mirror client (tx) --> redirect server (rx) --> compare (rx).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Secondary qemu: guest on top of a netfilter layer; filter execute
>>>>>>>>>>>>>>>>>>>>>>>>>>>> order is mirror server (tx) --> TCP rewriter (adjust ack / adjust seq,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> all) --> redirect client (rx).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> A tap device sits below the primary qemu (guest receive / guest send).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> NOTE: filter direction is rx/tx/all
>>>>>>>>>>>>>>>>>>>>>>>>>>>> rx: receive packets sent to the netdev
>>>>>>>>>>>>>>>>>>>>>>>>>>>> tx: receive packets sent by the netdev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I'd still like to decouple the comparer from netfilter. It has two
>>>>>>>>>>>>>>>>>>>>>>>> obvious advantages:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> - it can be reused by other dataplanes (e.g. vhost)
>>>>>>>>>>>>>>>>>>>>>>>> - the secondary redirector could redirect rx to the comparer on the
>>>>>>>>>>>>>>>>>>>>>>>> primary node directly, which simplifies the design.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guest recv packet route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> tap --> mirror client filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The mirror client will send the packet to the guest and, at the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> same time, copy and forward the packet to the secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> mirror server.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> mirror server filter --> TCP rewriter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If the received packet is a TCP packet, we will adjust the ack
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and update the TCP checksum, then send it to the secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guest. Otherwise we send it directly to the guest.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guest send packet route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guest --> redirect server filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The redirect server filter receives the primary guest packet
>>>>>>>>>>>>>>>>>>>>>>>>>>>> but does nothing, and just passes it to the next filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect server filter --> compare filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The compare filter receives the primary guest packet, then
>>>>>>>>>>>>>>>>>>>>>>>>>>>> waits for the secondary redirected packet to compare against it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If the packets are the same, send the primary packet and clear
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the secondary packet; otherwise send the primary packet and do a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> checkpoint.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guest --> TCP rewriter filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If the packet is a TCP packet, we will adjust the seq
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and update the TCP checksum, then send it to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter. Otherwise we send it directly to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter --> redirect server filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Forward the packet to the primary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> In the failover scenario (primary is down), the TCP rewriter will keep
>>>>>>>>>>>>>>>>>>>>>>>>>>>> servicing the TCP connections which were established after the last
>>>>>>>>>>>>>>>>>>>>>>>>>>>> checkpoint.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> How about this plan?
>>>>>>>>>>>>>>>>>>>>>>>> Sounds good.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> And there's indeed no need to distinguish client/server when reusing
>>>>>>>>>>>>>>>>>>>>>>>> the socket chardev. E.g.:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> In primary node:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>>>>>>>>>>>>>>>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>>>>>>>>>>>>>>>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>>>>>>>>>>>>>>>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>>>>>>>>>>>>>>>>>>> Why mirrorer has indev? 
>>>>>>>>>>>>>>>> As I said in the previous mails, I would like to decouple packet
>>>>>>>>>>>>>>>> comparing from netfilter. You've already done most of this since the
>>>>>>>>>>>>>>>> comparing is done in an independent thread. So the indev here is to
>>>>>>>>>>>>>>>> mirror the packet sent by guest to the packet comparing thread.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I think we can use traffic-redirector to do it.
>>>>>>>>>>>>>>>>>>>> The command line is:
>>>>>>>>>>>>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>>>>>>>>>>>>> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>>>>>>>>>>>>>>>>>>>> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>>>>>>>>>>>>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>>>>>>>>>>>>>>>>>>>> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>>>>>>>>>>>>>>>>>>>> out the packet.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, we can merge the socketdev comparer1 and mirrorer0.
>>>>>>>>>>>>>>>> It depends on whether or not packet comparing is done in a net filter
>>>>>>>>>>>>>>>> (which I would prefer not).
>>>>>>>>>>>> I mean that packet comparing is done in a thread, not a net filter.
>>>>>>>>>>>> The flow of the packet sent from guest:
>>>>>>>>>>>> 1. traffic-redirector: we will redirect the packet to comparer0, so
>>>>>>>>>>>>    the next filter will never see it.
>>>>>>>>>>>> 2. comparing thread: read it from socket chardev comparer0
>>>>>>>>>>>> 3. call qemu_net_queue_send_iov() to send it back to the netdev.
>>>>>>>>>> Ok, looks like I missed something.
>>>>>>>>>>
>>>>>>>>>> My suggestion tries its best to keep the packet comparing from being
>>>>>>>>>> tied to a filter or netdev. But your suggestion still needs it to be
>>>>>>>>>> coupled with a netdev. Are there any advantages of doing this (or is
>>>>>>>>>> there a reason that the packet must be sent to the netdev after
>>>>>>>>>> comparing)? If not, why not just
>>>>>>>> Yes, the packet must be sent to netdev after doing comparing. If the
>>>>>>>> primary packet and the secondary packet are the same (contain the same
>>>>>>>> application level data), we will drop the secondary packet and send
>>>>>>>> the primary packet to the netdev. Otherwise, we will sync the state.
>>>>>> And drop primary packet also here?
>>>> No, the primary packet must be sent back to the netdev, so the client
>>>> can receive the response.
>>>>
>>>> For example:
>>>> 1. guest has a ftp server
>>>> 2. we connect to the ftp server via the network
>>>> 3. both primary guest and secondary guest receive this request
>>>> 4. both primary guest and secondary guest ack it
>>>> 5. we compare these two ack packets in the comparing thread
>>>> 6. they are the same (the seqno is different, but that is not
>>>>    important; we can modify it in colo-rewriter). So we drop the
>>>>    secondary packet, and send the primary packet back to netdev
>>>> 7. The primary ack packet is sent to the ftp client via netdev.
>>>>
>>>> The ftp client only cares about the received packets. So if the packets
>>>> from the primary and secondary guests contain the same data, we can say
>>>> they are in the "same" state.
>>>>
>>>> Thanks
>>>> Wen Congyang
>>>>
>>> Thanks for the example. But I still don't get why it must be done before
>>> comparing, considering it will always be sent regardless of the result
>>> of the comparing?
>> Our goal is that the connection is OK after failover, and the user
>> doesn't know that one of the hosts crashed.
>>
>> If it is sent out regardless of the result of comparing and the primary
>> host crashes, the connection may be corrupted after failover. For
>> example: the packets from the primary and secondary hosts contain
>> different data, and we send the primary packet before comparing. The
>> primary host crashes before comparing these two packets. After
>> failover, the connection may be reset, or the client doesn't receive
>> the correct data, or some unexpected problem occurs.
>>
>> Another example (tcp):
>> 1. primary guest acks 100, and secondary guest only acks 95 (some
>>    packet is lost in the guest)
>> 2. client doesn't resend the lost packet
>> 3. the connection will be recovered after the next checkpoint
>> If we do failover before the next checkpoint, there is no way to
>> recover this connection.
>>
>> If we send out the packet after comparing, we can assume that the
>> client always receives the same data.
> 
> Thanks. I think I get the point. So if there's a difference, primary
> packet will only be sent after checkpoint and we could not assume the
> checkpoint itself is reliable.

Yes.
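
To spell out the ordering, here is a rough Python sketch of the rule (a toy
model only, not QEMU code; Packet, Comparer and all of their fields are
made-up names, and a real comparer would parse the TCP/IP headers instead
of looking at a bare payload). The primary packet is held until the
secondary copy arrives, the secondary copy is always dropped, and a
difference requests a checkpoint before the primary packet is released:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # TCP sequence number; may differ between the two guests
    payload: bytes  # application-level data, which is what the client sees


class Comparer:
    """Toy model: hold primary packets until the secondary copy arrives."""

    def __init__(self):
        self.primary = deque()    # packets redirected from the primary guest
        self.secondary = deque()  # packets redirected from the secondary guest
        self.sent = []            # packets released towards the netdev
        self.checkpoints = 0      # number of state syncs requested

    def on_primary(self, pkt):
        self.primary.append(pkt)
        self._compare()

    def on_secondary(self, pkt):
        self.secondary.append(pkt)
        self._compare()

    def _compare(self):
        # Nothing is released until both sides have produced a packet.
        while self.primary and self.secondary:
            p = self.primary.popleft()
            s = self.secondary.popleft()  # the secondary copy is always dropped
            if p.payload != s.payload:
                # Assumption: the counter stands in for a real checkpoint,
                # which must finish before the primary packet goes out.
                self.checkpoints += 1
            self.sent.append(p)           # the primary packet goes out either way
```

In this model a packet fed through on_primary() alone never shows up in
sent, which is the property that keeps the client's view consistent if the
primary host dies before the comparison happens.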

> 
> Back to the filters design. We'd better still decouple packet comparing
> out of netdev. Maybe a little bit more tweak on what you've suggested:
> 
> -netdev tap,id=hn0
> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0,indev=comparer2
> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,outdev=comparer2
> 
> Just add one more socket for the comparer for sending the primary packet,
> and let f1 redirect its output to netdev?

OK, I understand it now.
Thanks for your suggestion.
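
For my own understanding, the wiring in the quoted command line could be
modelled roughly as below. This is only a Python sketch under assumptions:
the socket chardevs are replaced by in-process queues, the function names
are invented for illustration, and whole-packet equality stands in for the
real comparison.

```python
from queue import Queue

# Stand-ins for the socket chardevs named in the quoted command line.
mirrorer0 = Queue()  # f0 outdev: copies of incoming packets for the secondary
comparer0 = Queue()  # f1 outdev: primary guest output, read by the comparer
comparer1 = Queue()  # secondary guest output, read by the comparer
comparer2 = Queue()  # comparer outdev, read back by f1 through its indev

netdev_out = []      # what finally leaves through the tap (hn0)


def f0_mirror(pkt):
    """traffic-mirrorer, queue=tx: copy to mirrorer0, pass the packet on."""
    mirrorer0.put(pkt)
    return pkt


def f1_redirect_out(pkt):
    """traffic-redirector, queue=rx: divert the guest's packet to comparer0."""
    comparer0.put(pkt)


def comparer_step():
    """Take one packet from each side; return True if a checkpoint is needed."""
    p = comparer0.get_nowait()
    s = comparer1.get_nowait()  # secondary copy is dropped after comparing
    need_checkpoint = p != s
    comparer2.put(p)            # the primary packet is what goes back out
    return need_checkpoint


def f1_redirect_in():
    """f1 reads its indev (comparer2) and hands the packet to the netdev."""
    netdev_out.append(comparer2.get_nowait())
```

With this layout the comparer only talks to sockets, so it does not need
to know about hn0 at all.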

Wen Congyang

> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>>
>>> .
>>>
>>
>>
> 
> 
> 
> .
> 





