
Re: [Qemu-devel] From virtio_kick until VM-exit?

From: charls chap
Subject: Re: [Qemu-devel] From virtio_kick until VM-exit?
Date: Wed, 27 Jul 2016 12:19:52 +0300

Hello All,

I am new to QEMU and I am trying to understand the I/O path of a synchronous
write. It turns out that I don't have a clear picture, especially of the
VM-exit and VM-entry parts.

Some generic questions first, and some other questions inline :)

1) If I am correct: when we run QEMU in emulation mode, without KVM, we run
on the TCG runtime. Are there no vcpu threads?

There are no interactions with the kvm module. On the other hand, when we
have hardware virtualization, there are no interactions with any part of the
TCG implementation.

tb_gen_code() in translate-all.c, and tb_find_slow()/tb_find_fast() -- these
are part of TCG, so are they still executed in the KVM case?
So do we have a for (;;) loop in which the vcpu thread executes code using
cpu-exec.c?

What is this pipe? I mean, between which two endpoints, and when is it used?
int event_notifier_test_and_clear(EventNotifier *e)
{
    int value;
    ssize_t len;
    char buffer[512];

    /* Drain the notify pipe.  For eventfd, only 8 bytes will be read.  */
    value = 0;
    do {
        len = read(e->rfd, buffer, sizeof(buffer));
        value |= (len > 0);
    } while ((len == -1 && errno == EINTR) || len == sizeof(buffer));

    return value;
}

I've tried to trace the iothread.
It seems that the following functions are executed once:

But I have no idea when static void *iothread_run(void *opaque) runs.
Actually, when is the iothread created?

On Wed, Jul 27, 2016 at 10:13 AM, Stefan Hajnoczi <address@hidden> wrote:

> On Tue, Jul 26, 2016 at 9:08 PM, charls chap <address@hidden> wrote:
> > Let's say that we run a VM over QEMU-KVM. One vcpu, virtio driver for
> > block. An app.c at first does some trivial stuff (non-privileged
> > instructions) and then performs a synchronous I/O (O_DIRECT, O_SYNC).
> >
> > My understanding of the path is:
> > The vcpu executes guest code in TCG for the trivial stuff. Then comes
> > the write(), so the vcpu switches to the guest kernel and goes down the
> > kernel path until the kick (PIO).
> > Then the vcpu blocks (0: is the vcpu thread blocked in the guest
> > kernel? If yes, spinning or waiting on a condition?) and then (1: what
> > are the invocations until it reaches "the other side"?)
> >
> > Then kvm handles the exit and switches to userspace, and the iothread
> > takes action (3: how does this happen?)
> You mentioned TCG above and now you mentioned KVM.  Either TCG
> (just-in-time compiler) or KVM (hardware virtualization extensions)
> can be used but not both at the same time.  TCG is used to translate
> instructions from the guest architecture to the host architecture,
> e.g. ARM guest on x86 host.  KVM is used to efficiently execute
> same-on-same, e.g. x86 guest on x86 host.  I'll assume you are using
> just KVM in your examples.
> The guest virtio_pci.ko driver contains an instruction that writes to
> the VIRTIO_PCI_QUEUE_NOTIFY hardware register.  This will cause a
> "vmexit" (a trap from guest mode back to host mode) and the kvm.ko
> host kernel module will inspect this trapped instruction and decide
> that it's an ioeventfd write.  The ioeventfd file descriptor will be
> signalled (it becomes readable).

This decision is made in static int vmx_handle_exit(struct kvm_vcpu *vcpu)
(kvm_vcpu: http://lxr.free-electrons.com/ident?v=2.6.33;i=kvm_vcpu).

What does "The ioeventfd file descriptor will be signalled (it becomes
readable)" mean?

> During the time in kvm.ko the guest vcpu is not executing because no
> host CPU is in guest mode for that vcpu context.  There is no spinning
> or waiting as you mentioned above.  The host CPU is simply busy doing
> other things and the guest vcpu is not running during that time.

If the vcpu is not sleeping, then it means that the vcpu didn't execute the
kick in the guest kernel.

> After the ioeventfd has been signalled, kvm.ko does a vmenter and
> resumes guest code execution.  The guest finds itself back after the
> instruction that wrote to VIRTIO_PCI_QUEUE_NOTIFY.
> During this time there has been no QEMU userspace activity because
> ioeventfd signalling happens in the kernel in the kvm.ko module.  So
> QEMU is still inside ioctl(KVM_RUN).
So the iothread is in control, and this is the thread that will follow the
common kernel path for the I/O submission and completion. I mean that the
iothread will be waiting in a host-kernel I/O wait queue after the
submission of the I/O.

In the meantime, kvm does a VM-entry to where?
Since the interrupt has not been delivered yet, the return point couldn't be
the guest interrupt handler...

> Now it's up to the host kernel to schedule the thread that is
> monitoring the ioeventfd file descriptor.  The ioeventfd has become
> readable so hopefully the scheduler will soon dispatch the QEMU event
> loop thread that is waiting in epoll(2)/ppoll(2).  Once the QEMU
> thread wakes up it will execute the virtio-blk device emulation code
> that processes the virtqueue.  The guest vcpu may be executing during
> this time.

> > 4: And then there is a virtual interrupt injection and a VM-entry to
> > the guest kernel, so the vcpu is unblocked and it executes the
> > completion bottom half?
> No, the interrupt injection is independent of the vmenter.  As
> mentioned above, the vcpu may run while virtio-blk device emulation
> happens (when ioeventfd is used, which is the default setting).
> The vcpu will receive an interrupt and jump to the virtio_pci
> interrupt handler function, which calls virtio_blk.ko function to
> process completed requests from the virtqueue.

From which thread, and in which function, does the VM-exit go to which point
in kvm.ko? And from which point of kvm.ko does the VM-entry go to which
point/function in QEMU?

The virtual interrupt injection goes from which point of the host kernel to
which point/function in QEMU?

> I'm not going further since my answers have changed the
> assumptions/model that you were considering.  Maybe it's all clear to
> you now.  Otherwise please email the QEMU mailing list at
> address@hidden and CC me instead of emailing me directly.  That
> way others can participate (e.g. if I'm busy and unable to reply
> quickly).
> Stefan

Thanks in advance for your time and patience
