
Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API


From: Ian Jackson
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
Date: Mon, 19 Jan 2009 16:57:01 +0000

Anthony Liguori writes ("Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API"):
> The packet IO API is a bit different.  It looks like:

The purpose here is to be able to make only one system call to the
host kernel in order to do an operation which involves a scatter-gather
list in guest physical memory (as provided by the guest to, e.g.,
an emulated DMA controller) ?

And the idea is to try to map as much of that contiguously as
possible so that only one system call need be made ?

> while (offset < size) {
>   data = map(offset, &len)
>   if (data == NULL)
>      break;
>   sg[n_sg].iov_base = data;
>   sg[n_sg].iov_len = len;
>   n_sg++;
>   offset += len;
> }
...
> if (offset < size) {
>   for (i = 0; i < n_sg; i++)
>      unmap(sg[i].iov_base);
>    sg[0].iov_base = alloc_buffer(size);
>    sg[0].iov_len = size;
>    cpu_physical_memory_rw(sg[0].iov_base, size);
> }

Why is it necessary for there only to be one of these

> do IO on (sg)

calls ?

Are there supposed to be fast-path devices where we absolutely must
make the host system call for the whole transfer in one go, in one
contiguous memory region ?

Obviously for the fast path to be actually fast the whole mapping must
succeed as requested, but this could be achieved by an interface where
the mapping caller provided a scatter gather list in guest memory:

  typedef struct {
    target_phys_addr_t addr, len;
  } CpuPhysicalMemoryMappingEntry;

  typedef struct {
    /* to be filled in by caller before calling _map: */
    unsigned flags;
    const CpuPhysicalMemoryMappingEntry *sg_list;
    target_phys_addr_t total_len; /* gives sg_list length; updated by _map */
    /* filled in by _map: */
    void *buffer;
    /* private to _map et al: */
  } CpuPhysicalMemoryMapping;

  void cpu_physical_memory_map(CpuPhysicalMemoryMapping *mapping,
      CpuPhysicalMemoryMapCallback *cb, void *cb_arg);
    /* There may be a limit on the number of concurrent maps
     * and the limit may be as low as one. */

If a caller doesn't want to deal with a complete sg_list then they do
  typedef struct {
    ...
    CpuPhysicalMemoryMapping dma_mapping;
    CpuPhysicalMemoryMappingEntry dma_request;
    ...
  } SomePCIDevice;
and do
  ourself->dma_mapping.sg_list = &ourself->dma_request;
  ourself->dma_mapping.total_len = ourself->dma_request.len;

> So this is why I prefer the map() API, as it accommodates two distinct 
> users in a way that the callback API wouldn't.  We can formalize these 
> idioms into an API, of course.

I don't think there is any fundamental difference between a callback
API and a polling API; you can implement whatever semantics you like
with either.

But callbacks are needed in at least some cases because of the way
that the bounce buffer may need to be reserved/released.  That means
all of the callers have to deal with callbacks anyway.

So it makes sense to make that the only code path.  That way callers
only need to be written once.

> BTW, to support this model, we have to reserve at least one bounce 
> buffer for cpu_physical_memory_rw.

Yes.

Ian.
