From: Juergen Pfennig
Subject: [Qemu-devel] [PATCH] A simple way to DMA (and async. IO) to make win2003 happy
Date: Wed, 1 Feb 2006 11:38:08 +0100
User-agent: KMail/1.7.2

Hello ...

Please note: I will send the Async IO stuff later, see below.

This Patch: Here is a very simple patch to ide.c that does not change
            the controller type and that lets an existing win2003
            installation use multi-sector IO and/or DMA.

There is another patch for BSD (sorry, I am not sure about the author's
name - is it John?).  That patch changes the controller, modifies the
BIOS and adds a simple way to do async writes. MY PATCH IS VERY DIFFERENT:

(1) The controller type is not changed. In qemu 0.8.0 some bits are set
    incorrectly, so Windows disables DMA.

(2) I have no documentation about the controller. WHO CAN SEND ME THE
    IDE CONTROLLER DOCUMENTATION? I took some ideas from John and
    used the linux kernel source. UNFORTUNATELY not all of my bits
    make the linux kernel happy - it complains and disables DMA. BUT

(3) My approach to async IO is different, see below. The implementation
    handles both async reads and writes. I also plan to make SDL async.

(4) The async IO layer integrates well with Fabrice Bellard's IOHandlers.
    In fact vl.c needs no modifications: the code uses a pipe to signal
    that IO has completed. The modifications on the block driver layer are
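
To illustrate the pipe idea from point (4), here is a minimal sketch of
how a background IO thread might wake the main loop through a pipe. It
is not taken from the patch: all names (done_pipe, io_worker,
on_io_done, run_pipe_demo) are hypothetical, the "IO" is only a
placeholder comment, and a plain select() call stands in for the
emulator main loop that would normally call a registered IOHandler.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical names throughout; none of this is taken from the patch. */
static int done_pipe[2];        /* [0] = read end (main loop), [1] = write end (worker) */
static int completed_requests;  /* only touched by the main thread */

/* Worker thread: pretend to finish one IO request, then wake the main loop. */
static void *io_worker(void *arg)
{
    uint8_t token = 1;
    (void)arg;
    /* ... the blocking read()/write() on the disk image would go here ... */
    if (write(done_pipe[1], &token, 1) != 1)
        return (void *)1;
    return NULL;
}

/* Main-loop side: runs when the read end of the pipe becomes readable. */
static void on_io_done(void *opaque)
{
    uint8_t token;
    (void)opaque;
    if (read(done_pipe[0], &token, 1) == 1)
        completed_requests++;   /* a real handler would run the completion callback */
}

/* Pushes one request through the worker; returns the number of
 * completions the main loop observed, or -1 on setup failure. */
int run_pipe_demo(void)
{
    pthread_t tid;
    fd_set rfds;
    int rc;

    completed_requests = 0;
    if (pipe(done_pipe) != 0)
        return -1;
    if (pthread_create(&tid, NULL, io_worker, NULL) != 0) {
        close(done_pipe[0]);
        close(done_pipe[1]);
        return -1;
    }

    /* Stand-in for the emulator main loop: sleep until the pipe is readable. */
    do {
        FD_ZERO(&rfds);
        FD_SET(done_pipe[0], &rfds);
        rc = select(done_pipe[0] + 1, &rfds, NULL, NULL, NULL);
    } while (rc < 0 && errno == EINTR);

    if (rc > 0 && FD_ISSET(done_pipe[0], &rfds))
        on_io_done(NULL);

    pthread_join(tid, NULL);
    close(done_pipe[0]);
    close(done_pipe[1]);
    return completed_requests;
}
```

The point of this design is that the main loop only ever waits on file
descriptors, so an IO completion looks like any other readable
descriptor and no extra locking is needed on the main-loop side.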

To say this again: the async IO stuff itself is about 600 lines of code
(in one file) but it needs several small changes in qemu. Here are some
fake definitions that might help you get a first impression:

#ifdef  QEMU_TOOL
    // Global: start the background thread
    int qaio_initialize(IoAsyncInst* inst, IoAsyncCall* spy) { return 0; }

    // Global: terminate the background thread
    int qaio_terminate(IoAsyncInst* inst) { return 0; }

    // Global: IOHandler used to run the completion callback
    void qaio_poll(void* opaque) {}

    // Client: this function is only called once. This stub just stores
    // the file handle in the caller's item pointer slot.
    int qaio_register(IoAsyncInst* i, IoAsyncItem** iptr, void* o, int file)
    {   int* pfile = (int*)iptr; *pfile = file; return 0;   }

    // Client: this function should be called to flush pending requests
    int qaio_unregister(IoAsyncItem** iptr) { return 0; }

    // Client: make this a child
    int qaio_parent(IoAsyncItem* item, IoAsyncItem* parent) { return 0; }

    // Client: begin a new request (flush or commit must follow)
    int qaio_begin(IoAsyncItem* item, IoAsyncCall* cbf, void* info)
    {   return 1;   }

    // Client: commit any pending request
    int qaio_commit(IoAsyncItem* item) { return 1; }

    // Client: flush any pending request synchronously
    int qaio_flush(IoAsyncItem* item) { return 1; }

    // Client: queue a write request
    int qaio_write(IoAsyncItem* item, const void* pdat, uint32_t count,
                   uint64_t offs)
    {   return qemu_write_at((int)item, pdat, count, offs); }

    // Client: queue a read request
    int qaio_read(IoAsyncItem* item, void* pdat, uint32_t count,
                  uint64_t offs)
    {   return qemu_read_at((int)item, pdat, count, offs); }
#endif  // QEMU_TOOL
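
As a rough illustration of how a client might use these calls, the
sketch below queues two sector writes between qaio_begin and
qaio_commit. Everything not shown above is an assumption: the
IoAsyncItem and IoAsyncCall definitions are placeholders, qemu_write_at
is replaced by a pwrite() wrapper, a nonzero return from begin/commit is
treated as success, and the item carries its file descriptor in a
struct field instead of the pointer cast used by the stubs.

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumed placeholder types; the posting does not show their definitions. */
typedef struct IoAsyncItem { int fd; } IoAsyncItem;
typedef void IoAsyncCall(void *info);

/* Assumed stand-in for qemu_write_at(): a plain positional write. */
static int qemu_write_at(int fd, const void *pdat, uint32_t count, uint64_t offs)
{
    return (int)pwrite(fd, pdat, count, (off_t)offs);
}

/* Synchronous stand-ins modelled on the fake definitions above. */
static int qaio_begin(IoAsyncItem *item, IoAsyncCall *cbf, void *info)
{   (void)item; (void)cbf; (void)info; return 1;   }

static int qaio_commit(IoAsyncItem *item) { (void)item; return 1; }

static int qaio_write(IoAsyncItem *item, const void *pdat, uint32_t count,
                      uint64_t offs)
{   return qemu_write_at(item->fd, pdat, count, offs);   }

/* Client: queue two 512-byte sector writes as one request.
 * Assumption: nonzero from begin/commit means success. */
int write_two_sectors(IoAsyncItem *item, const uint8_t *buf)
{
    int n0, n1;

    if (!qaio_begin(item, NULL, NULL))
        return -1;
    n0 = qaio_write(item, buf, 512, 0);
    n1 = qaio_write(item, buf + 512, 512, 512);
    if (!qaio_commit(item))
        return -1;
    return (n0 == 512 && n1 == 512) ? 0 : -1;
}

/* Writes two sectors to a temporary file and reads them back.
 * Returns 1 if the data round-trips, 0 if not, -1 on setup failure. */
int demo_roundtrip(void)
{
    uint8_t out[1024], in[1024];
    IoAsyncItem item;
    FILE *tmp = tmpfile();
    int ok;

    if (!tmp)
        return -1;
    memset(out, 0xAB, sizeof out);
    memset(in, 0, sizeof in);
    item.fd = fileno(tmp);
    ok = write_two_sectors(&item, out) == 0 &&
         pread(item.fd, in, sizeof in, 0) == (ssize_t)sizeof in &&
         memcmp(in, out, sizeof in) == 0;
    fclose(tmp);
    return ok;
}
```

With the real async layer the writes would presumably only be queued
and the callback passed to qaio_begin would report completion; in this
synchronous stand-in they complete immediately.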

DMA is very important for Windows. With synchronous IO you get roughly
the following throughput on a simple P4/2.4 GHz with a single disk:

    6 MByte/s       No Multi-Sector, no DMA (qemu 0.8.0 default)
    9 MByte/s       Multi-Sector, no DMA (this Patch)
   12 MByte/s       with DMA (this Patch)

With async IO the throughput can go up further but this saturates my
workstation disk, so I will not quote any numbers here. I should also
mention that I use a faster block layer driver (called bkf) that I
will submit after it has matured for a while.

The DMA change more or less just fixes a bug in qemu 0.8.0, whereas the
async IO is the really important part. Here is why:

(1) The Windows GUI now works "smoothly". The emulated machine is more
    responsive and the mouse cursor does not "hang". Please remember
    that this also requires the hack to make VGA faster that I submitted
    a while ago.
(2) The Windows clock loses far fewer timer ticks and now mostly shows
    the correct time. Perfmon now becomes really usable.
(3) Even under heavy IO load the CPU usage now falls below 100%, so
    CPU cycles are left for other activity on the host or the emulated PC.
(4) Async IO does not generally speed up batch operations, but the
    emulated PC behaves more like a real one. Example: the animated
    boot logo of windows.

Yours Juergen

Attachment: ide_dma.diff
Description: Text Data
