[Qemu-devel] Faster, generic IO/DMA model with vectored AIO?


From: Blue Swirl
Subject: [Qemu-devel] Faster, generic IO/DMA model with vectored AIO?
Date: Sat, 27 Oct 2007 15:56:47 +0300

Hi,

I changed the Slirp output to use vectored IO to avoid the slowdown from
memcpy (see the attached work-in-progress patch; it gives a small
performance improvement). But then I got the idea that using AIO would
be nice at the outgoing end of the network IO processing. In fact, a
vectored AIO model could even be used for generic DMA! The benefit
is that no buffering or copying should be needed.
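To illustrate the vectored output part, here is a rough sketch of the
idea (not the attached patch; the function and its parameters are just
made up for the example): hand the packet fragments to writev() in one
call instead of memcpy'ing them into a contiguous buffer first.

/* Rough sketch only: send a packet as two fragments with one writev()
 * call instead of memcpy'ing them into a contiguous buffer first. */
#include <sys/types.h>
#include <sys/uio.h>
#include <stdint.h>
#include <stddef.h>

static ssize_t send_fragments(int fd, uint8_t *hdr, size_t hdr_len,
                              uint8_t *payload, size_t payload_len)
{
    struct iovec iov[2];

    iov[0].iov_base = hdr;      /* protocol header in its own buffer */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = payload;  /* payload stays where it already is */
    iov[1].iov_len  = payload_len;

    return writev(fd, iov, 2);  /* one syscall, no intermediate copy */
}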

Instead of
void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                            int len, int is_write);
and its device variant, we'd have something like
int qemu_lio_listio(int mode, struct GenericAIOcb *list[],
                    unsigned int nent, IOCompletionFunc *cb);
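To make that a bit more concrete, GenericAIOcb could look roughly like
the POSIX struct aiocb, but carrying a scatter/gather list and a
completion callback. None of these fields exist yet; this is only an
illustration of the shape I have in mind:

/* Illustrative only -- none of these names exist yet. */
typedef void IOCompletionFunc(void *opaque, int ret);

struct GenericAIOcb {
    target_phys_addr_t addr;   /* bus/guest address of the transfer      */
    struct iovec      *iov;    /* host vectors, filled in by translation */
    int                niov;
    int                is_write;
    IOCompletionFunc  *cb;     /* optional per-request completion        */
    void              *opaque;
};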

Each stage would translate the IO list and callback as needed, and only
the final stage would perform the actual IO or memcpy. This would be used
at each stage of the chain memory<->IOMMU<->device<->SLIRP<->host network
device. Of course, some kind of host support for vectored AIO for these
devices is required. On the target side, devices that can do
scatter/gather DMA would benefit the most.
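As a rough sketch of the "translate and pass on" step, an IOMMU stage
could rewrite each segment's address and forward the same list
(iommu_translate() and next_stage_lio_listio() are assumed helpers here,
not existing code):

/* Hypothetical sketch: an IOMMU stage rewrites each segment's address
 * and forwards the request; only the final stage touches the data. */
static int iommu_lio_listio(int mode, struct GenericAIOcb *list[],
                            unsigned int nent, IOCompletionFunc *cb)
{
    unsigned int i;

    for (i = 0; i < nent; i++) {
        list[i]->addr = iommu_translate(list[i]->addr, list[i]->is_write);
    }
    return next_stage_lio_listio(mode, list, nent, cb);
}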

For the specific Sparc32 case, Lance bus byte swapping unfortunately
makes buffering necessary at that stage, unless we can make N vectors
of just a single byte each faster than a memcpy + bswap of a memory
block of size N.
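For comparison, the buffered path would be something like this (just a
sketch, and assuming the swap is within 16-bit words):

/* Copy and byte swap 16-bit words in one pass over the block. */
static void copy_bswap16(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        dst[i]     = src[i + 1];
        dst[i + 1] = src[i];
    }
}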

Comments?

Attachment: slirp_iov.diff
Description: Text Data

