
Re: [Qemu-devel] From virtio_kick until VM-exit?

From: charls chap
Subject: Re: [Qemu-devel] From virtio_kick until VM-exit?
Date: Wed, 27 Jul 2016 12:19:52 +0300

Hello All,

I am new to QEMU and I am trying to understand the I/O path of a synchronous
write. It turns out that I don't have a clear picture, especially of the
VM-exit and VM-entry parts.

Some generic questions first, and some other questions inline :)

1) If I am correct: when we run QEMU in emulation mode, without KVM, we run
on the TCG runtime. Are there no vcpu threads?

There are no interactions with the kvm module. On the other hand, when we
have hardware virtualization, there are no interactions with any part of the
TCG implementation.

tb_gen_code() in translate-all.c, and tb_find_slow()/tb_find_fast() -- these
are part of TCG, so are they still executed in the KVM case?
So do we have a for (;;) loop in which the vcpu thread executes code using
cpu-exec.c?

What is this pipe? I mean, between which two endpoints, and when is it used?
int event_notifier_test_and_clear(EventNotifier *e)
{
    int value;
    ssize_t len;
    char buffer[512];

    /* Drain the notify pipe.  For eventfd, only 8 bytes will be read.  */
    value = 0;
    do {
        len = read(e->rfd, buffer, sizeof(buffer));
        value |= (len > 0);
    } while ((len == -1 && errno == EINTR) || len == sizeof(buffer));

    return value;
}

I've tried to trace the iothread.
It seems that the following functions are executed once:

But I have no idea when static void *iothread_run(void *opaque) runs.
Actually, when is the iothread created?

On Wed, Jul 27, 2016 at 10:13 AM, Stefan Hajnoczi <address@hidden> wrote:

> On Tue, Jul 26, 2016 at 9:08 PM, charls chap <address@hidden> wrote:
> > Let's say that we run a VM over QEMU-KVM. One vcpu, virtio driver for
> > block. An app.c at first does some trivial stuff (non-privileged
> > instructions) and then performs a synchronous I/O (O_DIRECT, O_SYNC).
> >
> > My understanding of the path is:
> > The vcpu executes guest code in TCG for the trivial stuff. Then comes
> > the write(), so the vcpu switches to the guest kernel and goes down the
> > kernel path until the kick (PIO).
> > Then the vcpu blocks (0: is the vcpu thread blocked in the guest
> > kernel? If yes, spinning or waiting on a condition?) and then (1: what
> > are the invocations until it reaches "the other side"?)
> >
> > Then kvm handles the exit and switches to userspace, and the iothread
> > takes action (3: how does this happen?)
> You mentioned TCG above and now you mentioned KVM.  Either TCG
> (just-in-time compiler) or KVM (hardware virtualization extensions)
> can be used but not both at the same time.  TCG is used to translate
> instructions from the guest architecture to the host architecture,
> e.g. ARM guest on x86 host.  KVM is used to efficiently execute
> same-on-same, e.g. x86 guest on x86 host.  I'll assume you are using
> just KVM in your examples.
> The guest virtio_pci.ko driver contains an instruction that writes to
> the VIRTIO_PCI_QUEUE_NOTIFY hardware register.  This will cause a
> "vmexit" (a trap from guest mode back to host mode) and the kvm.ko
> host kernel module will inspect this trapped instruction and decide
> that it's an ioeventfd write.  The ioeventfd file descriptor will be
> signalled (it becomes readable).

This decision is made in static int vmx_handle_exit(struct kvm_vcpu *vcpu)
(kvm_vcpu: http://lxr.free-electrons.com/ident?v=2.6.33;i=kvm_vcpu).

What does "The ioeventfd file descriptor will be signalled (it becomes
readable)" mean?

> During the time in kvm.ko the guest vcpu is not executing because no
> host CPU is in guest mode for that vcpu context.  There is no spinning
> or waiting as you mentioned above.  The host CPU is simply busy doing
> other things and the guest vcpu is not running during that time.

If the vcpu is not sleeping, then it means that the vcpu didn't execute the
kick in the guest kernel.

> After the ioeventfd has been signalled, kvm.ko does a vmenter and
> resumes guest code execution.  The guest finds itself back after the
> instruction that wrote to VIRTIO_PCI_QUEUE_NOTIFY.
> During this time there has been no QEMU userspace activity because
> ioeventfd signalling happens in the kernel in the kvm.ko module.  So
> QEMU is still inside ioctl(KVM_RUN).
So the iothread is in control, and this is the thread that will follow the
common kernel path for the I/O submission and completion. I mean that the
iothread will be waiting in a host-kernel I/O wait queue after the
submission of the I/O.

In the meantime, kvm does a VM-entry to where?
Since the interrupt has not been delivered yet, the return point couldn't be
the guest interrupt handler...

> Now it's up to the host kernel to schedule the thread that is
> monitoring the ioeventfd file descriptor.  The ioeventfd has become
> readable so hopefully the scheduler will soon dispatch the QEMU event
> loop thread that is waiting in epoll(2)/ppoll(2).  Once the QEMU
> thread wakes up it will execute the virtio-blk device emulation code
> that processes the virtqueue.  The guest vcpu may be executing during
> this time.

> > 4: And then there is a virtual interrupt injection and a VM-entry to
> > the guest kernel, so the vcpu is unblocked and it executes the
> > completion bottom half?
> No, the interrupt injection is independent of the vmenter.  As
> mentioned above, the vcpu may run while virtio-blk device emulation
> happens (when ioeventfd is used, which is the default setting).
> The vcpu will receive an interrupt and jump to the virtio_pci
> interrupt handler function, which calls virtio_blk.ko function to
> process completed requests from the virtqueue.

From which thread, and in which function, does the VM-exit go to which point
in kvm.ko? And from which point of kvm.ko does the VM-entry go to which
point/function in QEMU?

The virtual interrupt injection goes from which point of the host kernel to
which point/function in QEMU?

> I'm not going further since my answers have changed the
> assumptions/model that you were considering.  Maybe it's all clear to
> you now.  Otherwise please email the QEMU mailing list at
> address@hidden and CC me instead of emailing me directly.  That
> way others can participate (e.g. if I'm busy and unable to reply
> quickly).
> Stefan

Thanks in advance for your time and patience
