qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] TCP based PCIE request forwarding


From: lementec fabien
Subject: Re: [Qemu-devel] TCP based PCIE request forwarding
Date: Thu, 22 Nov 2012 11:08:51 +0100

Hi,

Thanks for the feedback, I will modify the previous document
to include the changes you mentionned. I reply here too.

2012/11/22 Stefan Hajnoczi <address@hidden>:
> On Wed, Nov 21, 2012 at 03:27:48PM +0100, lementec fabien wrote:
>> usage
>> -----
>> PCIEFW devices are instanciated using the following QEMU options:
>> -device \
>>  pciefw,\
>>  laddr=<local_addr>,\
>>  lport=<local_port>,\
>>  raddr=<remote_addr>,\
>>  rport=<remote_port>
>
> Take a look at qemu_socket.h:socket_parse().  It should allow you to
> support TCP, UNIX domain sockets, and arbitrary file descriptors.
>

ok, I will have a look what it implies to support arbitrary file descriptor.
For instance, my current implementation does not work with UDP sockets.
It assumes a reliable, ordered transport layer, whose OS API is not datagram
oriented.

>> implementation
>> --------------
>> PCIEFW is a PCIE accesses forwarding device added to the QEMU source tree. At
>> initialization, this device opens a bidirectionnal point to point 
>> communication
>> channel with an external process. This process actually implements the PCIE
>> endpoint. That is, a PCIE access made by QEMU is forwarded to the process.
>> Reciprocally, replies and interrupts messages from the process are forwarded
>> to QEMU.
>>
>> The commnication currently relies on a bidirectionnal point to point TCP
>
> s/commnication/communication/
>
>> socket based channel. Byte ordering is little endian.
>>
>> PCIEFW initiates a request upon access from QEMU. It sends a message whose
>> format is described by the pciefw_msg_t type:
>>
>> typedef struct pciefw_msg
>> {
>> #define PCIEFW_MSG_MAX_SIZE (offsetof(pciefw_msg_t, data) + 0x1000)
>
> The size field is uint16_t.  Do you really want to limit to 4 KB of
> data?
>

My first implementation required to allocate a fixed size buffer. It
is no longer
the case (with non datagram oriented IO operations) since I included the header
that contains the message size. Since PCIE maximum payload size is 0x1000,
it was an obvious choice. Of course, it is an arbitrary choice.

>>
>>   pciefw_header_t header;
>>
>> #define PCIEFW_OP_READ_CONFIG 0
>> #define PCIEFW_OP_WRITE_CONFIG 1
>> #define PCIEFW_OP_READ_MEM 2
>> #define PCIEFW_OP_WRITE_MEM 3
>> #define PCIEFW_OP_READ_IO 4
>> #define PCIEFW_OP_WRITE_IO 5
>> #define PCIEFW_OP_INT 6
>> #define PCIEFW_OP_MSI 7
>> #define PCIEFW_OP_MSIX 8
>>
>>   uint8_t op; /* in PCIEFW_OP_XXX */
>>   uint8_t bar; /* in [0:5] */
>>   uint8_t width; /* access in 1, 2, 4, 8 */
>>   uint64_t addr;
>>   uint16_t size; /* data size, in bytes */
>
> Why is are both width and size fields?  For read-type operations the
> size field would indicate how many bytes to read.  For write-type
> operations the size field would indicate how many bytes are included in
> data[].
>

Actually, the width field is currently not required. I included it to
allow multiple
contiguous accesses in 1 operation (where count = size / width). The device
would still need know the width of individual accesses in this case. But this is
not used.

>>   uint8_t data[1];
>>
>> } __attribute__((packed)) pciefw_msg_t;
>>
>> Note that data is a variable length field.
>>
>> The PCIE endpoint process replies with a pciefw_reply_t formatted message:
>>
>> typedef struct pciefw_reply
>> {
>>   pciefw_header_t header;
>>   uint8_t status;
>
> What values does this field take?
>

I will define a PCIEFW_STATUS_XXX

>>   uint8_t data[8];
>> } __attribute__((packed)) pciefw_reply_t;
>>
>> The PCIE endpoint process can initiate pciefw_msg_t to perform write 
>> operations
>> of its own. This is used to perform data transfer (DMA engines ...) and send
>> interrupts.
>
> Any flow control rules?  For example, can the endpoint raise an
> interrupt while processing a message (before it sends a reply)?
>

Currently, messages are not identified so the transport is assumed
to be made in order. In practice, it works because of the LINUX
application I use does not starts 2 DMA transfers in parallel. But a
protocol cannot rely on such assumptions. Plus, I assume QEMU
can eventually make 2 PCIE concurrent accesses on the same
device which would lead to 2 replies. so I will add an identifier field.

>> Both types start with a pciefw_header containing the total size:
>>
>> typedef struct pciefw_header
>> {
>>   uint16_t size;
>> } __attribute__((packed)) pciefw_header_t;
>
> A "hello" message type would be useful so that you can extend the
> protocol in the future.  The message would contain feature bits or a
> version number.
>

I did think about it. More generally, it would be useful to have a control
message to allow an endpoint to be disconnected then reconnected
without having to reboot QEMU. It is very useful when developping a
new device.

> Stefan

I will send you the modified document,

Thanks,

Fabien.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]