Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter


From: Jason Wang
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
Date: Fri, 22 Jan 2016 11:15:54 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1


On 01/20/2016 06:30 PM, Wen Congyang wrote:
> On 01/20/2016 06:19 PM, Jason Wang wrote:
>> > 
>> > 
>> > On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>> >> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>> >>>
>>>> >>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>>> >>>>> Sure.
>>>>>> >>>>>
>>>>>> >>>>> Two main comments/suggestions:
>>>>>> >>>>>
>>>>>> >>>>> - TCP analysis is missing in the current version. Maybe you could point
>>>>>> >>>>> me to a git tree (or another version of the RFC) for a better
>>>>>> >>>>> understanding of the design. (Just a skeleton for TCP should be
>>>>>> >>>>> sufficient to discuss.)
>>>>>> >>>>> - I prefer to make the code as reusable as possible, so it's better to
>>>>>> >>>>> split/decouple the reusable parts from the code. A vague idea is:
>>>>>> >>>>>
>>>>>> >>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>>> >>>>> 99% of this since the work is done in a thread. Just let the thread
>>>>>> >>>>> poll sockets directly; then the comparing can be reused by other
>>>>>> >>>>> kinds of dataplane.
>>>>>> >>>>> 2) Implement traffic mirror/redirector as a filter.
>>>>>> >>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>> >>>>>
>>>>>> >>>>> Then, on the primary node, you need just a traffic mirror, which:
>>>>>> >>>>> - mirrors ingress traffic to the secondary node
>>>>>> >>>>> - mirrors egress traffic to the packet comparing thread
>>>>>> >>>>>
>>>>>> >>>>> And on the secondary node, you need two filters:
>>>>>> >>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>>> >>>>> - A traffic redirector which redirects packets from a socket as
>>>>>> >>>>> ingress traffic, and redirects egress traffic to the socket which
>>>>>> >>>>> could be polled by the remote packet comparing thread.
>>>>>> >>>>>
>>>>>> >>>>> Thoughts?
>>>>>> >>>>>
>>>>>> >>>>> Thanks
>>>>>> >>>>>
>>>>>>> >>>>>> Thanks
>>>>>>> >>>>>> zhangchen
>>>>> >>>>
>>>>> >>>> Hi, Jason.
>>>>> >>>> We considered your suggestion to split/decouple
>>>>> >>>> the reusable parts from the code.
>>>>> >>>> Because filter plugins are traversed one by one in order,
>>>>> >>>> we will split colo-proxy into three filters on each side.
>>>>> >>>>
>>>>> >>>> But in this plan, primary and secondary both have a socket
>>>>> >>>> server, so startup is a problem.
>>>> >>> I believe this issue could be solved by reusing socket chardev.
>>>> >>>
>>>>> >>>>
>>>>> >>>>  Primary qemu                                                       Secondary qemu
>>>>> >>>> +----------------------------------------------------------+       +-----------------------------------------------------------+
>>>>> >>>> | +-----------------------------------------------------+  |       |  +------------------------------------------------------+ |
>>>>> >>>> | |                                                     |  |       |  |                                                      | |
>>>>> >>>> | |                        guest                        |  |       |  |                        guest                         | |
>>>>> >>>> | |                                                     |  |       |  |                                                      | |
>>>>> >>>> | +-----------^--------------+--------------------------+  |       |  +---------------------+--------+-----------------------+ |
>>>>> >>>> |             |              |                             |       |                        ^        |                         |
>>>>> >>>> |             |              |                             |       |                        |        |                         |
>>>>> >>>> |             +-------------------------------------------------+  |                        |        |                         |
>>>>> >>>> |  netfilter  |              |                             |    |  |   netfilter            |        |                         |
>>>>> >>>> | +-----------------------------------------------------+  |    |  |  +------------------------------------------------------+ |
>>>>> >>>> | |           |              |    filter execute order  |  |    |  |  |                     |        | filter execute order  | |
>>>>> >>>> | |           |              |    +-------------------> |  |    |  |  |                     |        | +-------------------> | |
>>>>> >>>> | |           |              |                          |  |    |  |  |                     |        |   TCP                 | |
>>>>> >>>> | | +---------+-+     +------v-----+    +----+ +-----+  |  |    |  |  | +-----------+   +---+----+---v+rewriter+  +--------+ | |
>>>>> >>>> | | |           |     |            |    |            |  |  |    |  |  | |           |   |        |             |  |        | | |
>>>>> >>>> | | |  mirror   |     |  redirect  +---->  compare   |  |  |    +--------> mirror   +---> adjust |   adjust    +-->redirect| | |
>>>>> >>>> | | |  client   |     |  server    |    |            |  |  |       |  | |  server   |   | ack    |   seq       |  |client  | | |
>>>>> >>>> | | |           |     |            |    |            |  |  |       |  | |           |   |        |             |  |        | | |
>>>>> >>>> | | +----^------+     +----^-------+    +-----+------+  |  |       |  | +-----------+   +--------+-------------+  +----+---+ | |
>>>>> >>>> | |      |     tx          |      rx          |     rx  |  |       |  |            tx                        all       |  rx | |
>>>>> >>>> | +-----------------------------------------------------+  |       |  +------------------------------------------------------+ |
>>>>> >>>> |        |                 +-------------------------------------------------------------------------------------------+       |
>>>>> >>>> |        |                                    |            |       |                                                           |
>>>>> >>>> +----------------------------------------------------------+       +-----------------------------------------------------------+
>>>>> >>>>          |                                    |
>>>>> >>>>          |guest receive                       |guest send
>>>>> >>>>          |                                    |
>>>>> >>>> +--------+------------------------------------v------------+
>>>>> >>>> |                                                          |
>>>>> >>>> |                                                          |
>>>>> >>>> |                         tap                              |                              NOTE: filter direction is rx/tx/all
>>>>> >>>> |                                                          |                              rx:receive packets sent to the netdev
>>>>> >>>> |                                                          |                              tx:receive packets sent by the netdev
>>>>> >>>> +----------------------------------------------------------+
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>> >>> I would still like to decouple the comparer from netfilter. It has two
>>>> >>> obvious advantages:
>>>> >>>
>>>> >>> - it can be reused by other dataplanes (e.g. vhost)
>>>> >>> - the secondary redirector could redirect rx to the comparer on the
>>>> >>> primary node directly, which simplifies the design.
>>>> >>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> guest recv packet route
>>>>> >>>>
>>>>> >>>> primary
>>>>> >>>> tap --> mirror client filter
>>>>> >>>> The mirror client sends the packet to the guest and, at the
>>>>> >>>> same time, copies and forwards it to the secondary
>>>>> >>>> mirror server.
>>>>> >>>>
>>>>> >>>> secondary
>>>>> >>>> mirror server filter --> TCP rewriter
>>>>> >>>> If the received packet is a TCP packet, we adjust the ack
>>>>> >>>> and update the TCP checksum, then send it to the secondary
>>>>> >>>> guest. Otherwise we send it to the guest directly.
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> guest send packet route
>>>>> >>>>
>>>>> >>>> primary
>>>>> >>>> guest --> redirect server filter
>>>>> >>>> The redirect server filter receives the primary guest packet
>>>>> >>>> but does nothing; it just passes it to the next filter.
>>>>> >>>>
>>>>> >>>> redirect server filter --> compare filter
>>>>> >>>> The compare filter receives the primary guest packet, then
>>>>> >>>> waits for the secondary's redirected packet to compare against.
>>>>> >>>> If the packets are the same, send the primary packet and discard
>>>>> >>>> the secondary packet; otherwise send the primary packet and do a
>>>>> >>>> checkpoint.
>>>>> >>>>
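The compare step in the quoted plan can be modelled roughly as below. This is only an illustrative Python sketch, not QEMU code: the name `compare_step`, the two in-memory queues, and plain byte equality are all assumptions made for the example. A real comparer would probably work per connection, be protocol-aware, and need a timeout for a secondary packet that never arrives.

```python
from collections import deque
from typing import Deque, List, Tuple


def compare_step(primary_q: Deque[bytes],
                 secondary_q: Deque[bytes]) -> List[Tuple[bytes, bool]]:
    """Pair up queued packets from the primary and secondary guests.

    Returns a list of (packet_to_send, checkpoint_needed) tuples.
    A primary packet stays queued until a secondary packet is available.
    """
    decisions = []
    while primary_q and secondary_q:
        primary_pkt = primary_q.popleft()
        secondary_pkt = secondary_q.popleft()  # discarded after comparing
        # The primary packet is always sent; a mismatch asks for a checkpoint.
        decisions.append((primary_pkt, primary_pkt != secondary_pkt))
    return decisions
```

In this model the primary packet is released in both cases, and the only difference a mismatch makes is the checkpoint request, which matches the route described above.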
>>>>> >>>> secondary
>>>>> >>>> guest --> TCP rewriter filter
>>>>> >>>> If the packet is a TCP packet, we adjust the seq
>>>>> >>>> and update the TCP checksum, then send it to the
>>>>> >>>> redirect client filter. Otherwise we send it directly to the
>>>>> >>>> redirect client filter.
>>>>> >>>>
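To make "adjust seq/ack and update the TCP checksum" concrete, here is a small Python sketch. It is not the colo-proxy implementation: the helper names, the fixed per-connection `offset`, and the choice to recompute the whole checksum (rather than update it incrementally) are assumptions for illustration. It relies only on the standard TCP header layout (seq at byte 4, ack at byte 8, checksum at byte 16) and the IPv4 pseudo-header.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return (~ones_complement_sum(pseudo + segment)) & 0xFFFF


def rewrite_tcp_field(src_ip: bytes, dst_ip: bytes, segment: bytes,
                      offset: int, field_offset: int = 4) -> bytes:
    """Add `offset` to the 32-bit field at `field_offset` and fix the checksum.

    field_offset=4 rewrites seq, field_offset=8 rewrites ack.
    """
    seg = bytearray(segment)
    (value,) = struct.unpack_from("!I", seg, field_offset)
    struct.pack_into("!I", seg, field_offset, (value + offset) & 0xFFFFFFFF)
    struct.pack_into("!H", seg, 16, 0)  # zero the checksum before recomputing
    struct.pack_into("!H", seg, 16, tcp_checksum(src_ip, dst_ip, bytes(seg)))
    return bytes(seg)
```

Recomputing the full checksum keeps the sketch short; an incremental update would avoid touching the payload but is not shown here.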
>>>>> >>>> redirect client filter --> redirect server filter
>>>>> >>>> forward packet to primary
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> In the failover scenario (primary is down), the TCP rewriter will keep
>>>>> >>>> servicing the TCP connections that were established after the last
>>>>> >>>> checkpoint.
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> How about this plan?
>>>> >>> Sounds good.
>>>> >>>
>>>> >>> And there's indeed no need to distinguish client/server when reusing the
>>>> >>> socket chardev. E.g.:
>>>> >>>
>>>> >>> In primary node:
>>>> >>>
>>>> >>> ...
>>>> >>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>>> >>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>>> >>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>>> >>> -netdev tap,id=hn0
>>>> >>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>>> >>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>> >> Why does the mirrorer have an indev?
>> > 
>> > 
>> > As I said in the previous mails, I would like to decouple packet
>> > comparing from netfilter. You've already done most of this since the
>> > comparing is done in an independent thread. So the indev here is to
>> > mirror the packet sent by guest to the packet comparing thread.
>> > 
>>> >> I think we can use traffic-redirector to do it.
>>> >> The command line is:
>>> >> -netdev tap,id=hn0
>>> >> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>>> >> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>>> >> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>>> >> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>>> >> out the packet.
>>> >>
>>> >> Also, we can merge the socketdev comparer1 and mirrorer0.
>> > 
>> > It depends on whether or not packet comparing was done in a net filter
>> > (which I prefer not).
> I mean that packet comparing is done in a thread, not a net filter.
> The flow of a packet sent from the guest:
> 1. traffic-redirector: we redirect the packet to comparer0; the next
>    filter will never see it.
> 2. comparing thread: read it from socket chardev comparer0
> 3. call qemu_net_queue_send_iov() to send it back to the netdev.

OK, it looks like I missed something.

My suggestion tries its best to keep packet comparing from being tied to a
filter or netdev, but your suggestion still needs it to be coupled with a
netdev. Is there any advantage to doing this (or is there a reason that the
packet must be sent to the netdev after comparing)? If not, why not just
mirror (duplicate the packet and forward the copy to a chardev, and pass the
original packet to the next filter or netdev)? Also, calling
qemu_net_queue_send_iov() on a netdev from another thread may need some
synchronization with the iothread.
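
To illustrate the difference between the two behaviours being discussed, here is a toy Python model of a filter chain. It is not QEMU code: the `Mirror` and `Redirector` classes, the `receive` method, and the list standing in for a chardev are all hypothetical names chosen for the sketch. A mirror copies the packet to the side channel and lets the original continue; a redirector diverts the packet, so later filters (and the netdev) never see it unless something sends it back.

```python
from typing import List, Optional, Sequence


class Mirror:
    """Copy each packet to `outdev`; the original continues down the chain."""

    def __init__(self, outdev: List[bytes]) -> None:
        self.outdev = outdev  # stands in for a socket chardev

    def receive(self, pkt: bytes) -> Optional[bytes]:
        self.outdev.append(bytes(pkt))
        return pkt


class Redirector:
    """Divert each packet to `outdev`; nothing continues down the chain."""

    def __init__(self, outdev: List[bytes]) -> None:
        self.outdev = outdev

    def receive(self, pkt: bytes) -> Optional[bytes]:
        self.outdev.append(pkt)
        return None


def run_chain(filters: Sequence, pkt: bytes) -> Optional[bytes]:
    """Pass a packet through the filters in order; None means it was consumed."""
    for f in filters:
        result = f.receive(pkt)
        if result is None:
            return None
        pkt = result
    return pkt
```

With the mirror, the comparing thread only reads copies and never has to inject anything back into the netdev; with the redirector, some other component must re-send the packet, which is where the cross-thread synchronization question comes from.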

>
> Thanks
> Wen Congyang
>
>> > 



