[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STO
From: |
Alexander Graf |
Subject: |
Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations |
Date: |
Fri, 06 Feb 2015 08:56:32 +0100 |
User-agent: |
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 |
On 06.02.15 03:54, David Gibson wrote:
> On Thu, Feb 05, 2015 at 12:55:45PM +0100, Alexander Graf wrote:
>>
>>
>> On 05.02.15 12:30, David Gibson wrote:
>>> On Thu, Feb 05, 2015 at 11:22:13AM +0100, Alexander Graf wrote:
> [snip]
>>>>>>>>>> [snip]
>>>>>>>>>>
>>>>>>>>>>> + ret1 = kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_LOAD);
>>>>>>>>>>> + if (ret1 != 0) {
>>>>>>>>>>> + fprintf(stderr, "Warning: error enabling H_LOGICAL_CI_LOAD
>>>>>>>>>>> in KVM:"
>>>>>>>>>>> + " %s\n", strerror(errno));
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + ret2 = kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_STORE);
>>>>>>>>>>> + if (ret2 != 0) {
>>>>>>>>>>> + fprintf(stderr, "Warning: error enabling
>>>>>>>>>>> H_LOGICAL_CI_STORE in KVM:"
>>>>>>>>>>> + " %s\n", strerror(errno));
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + if ((ret1 != 0) || (ret2 != 0)) {
>>>>>>>>>>> + fprintf(stderr, "Warning: Couldn't enable H_LOGICAL_CI_*
>>>>>>>>>>> in KVM, SLOF"
>>>>>>>>>>> + " may be unable to operate devices with in-kernel
>>>>>>>>>>> emulation\n");
>>>>>>>>>>> + }
>>>>>>>>>>
>>>>>>>>>> You'll always get these warnings if you're running on an old (meaning
>>>>>>>>>> current upstream) kernel, which could be annoying.
>>>>>>>>>
>>>>>>>>> True.
>>>>>>>>>
>>>>>>>>>> Is there any way
>>>>>>>>>> to tell whether you have configured any devices which need the
>>>>>>>>>> in-kernel MMIO emulation and only warn if you have?
>>>>>>>>>
>>>>>>>>> In theory, I guess so. In practice I can't see how you'd enumerate
>>>>>>>>> all devices that might require kernel intervention without something
>>>>>>>>> horribly invasive.
>>>>>>>>
>>>>>>>> We could WARN_ONCE in QEMU if we emulate such a hypercall, but its
>>>>>>>> handler is io_mem_unassigned (or we add another minimum priority huge
>>>>>>>> memory region on all 64bits of address space that reports the
>>>>>>>> breakage).
>>>>>>>
>>>>>>> Would that work for the virtio+iothread case? I had the impression
>>>>>>> the kernel handled notification region was layered over the qemu
>>>>>>> emulated region in that case.
>>>>>>
>>>>>> IIRC we don't have a way to call back into kvm saying "please write to
>>>>>> this in-kernel device". But we could at least defer the warning to a
>>>>>> point where we know that we actually hit it.
>>>>>
>>>>> Right, but I'm saying we might miss the warning in cases where we want
>>>>> it, because the KVM device is shadowed by a qemu device, so qemu won't
>>>>> see the IO as unassigned or unhandled.
>>>>>
>>>>> In particular, I think that will happen in the case of virtio-blk with
>>>>> iothread, which is the simplest case in which to observe the problem.
>>>>> The virtio-blk device exists in qemu and is functional, but we rely on
>>>>> KVM catching the queue notification MMIO before it reaches the qemu
>>>>> implementation of the rest of the device's IO space.
>>>>
>>>> But in that case the VM stays functional and will merely see a
>>>> performance hit when using virtio in SLOF, no? I don't think that's
>>>> a problem worth worrying users about.
>>>
>>> Alas, no. The iothread stuff *relies* on the in-kernel notification,
>>> so it will not work if the IO gets punted to qemu. This is the whole
>>> reason for the in-kernel hcall implementation.
>>
>> So at least with vhost-net the in-kernel trapping is optional. If we
>> happen to get MMIO into QEMU, we'll just handle it there.
>>
>> Enlighten me why the iothread stuff can't handle it that way too.
>
> So, as I understand it, it could, but it doesn't. Working out how to
> fix it properly requires better understanding of the dataplane code
> than I currently possess,
>
> So, using virtio-blk as the example case. Normally the queue notify
> mmio will get routed by the general virtio code to
> virtio_blk_handle_output().
>
> In the case of dataplane, that just calls
> virtio_blk_data_plane_start(). So the first time we get a vq notify,
> the dataplane is started. That sets up the host notifier
> (VirtioBusClass::set_host_notifier -> virtio_pci_set_host_notifier ->
> virtio_pci_set_host_notifier_internal -> memory_region_add_eventfd()
> -> memory_region_transaction_commit() ->
> address_space_update_ioeventfds - >address_space_add_del_ioeventfds ->
> kvm_mem_ioeventfd_add -> kvm_set_ioeventfd_mmio -> KVM_IOEVENTFD
> ioctl)
>
> From this point on further calls to virtio_blk_handle_output() are
> IIUC a "can't happen", because vq notifies should go to the eventfd
> instead, where they will kick the iothread.
>
> So, with SLOF, the first request is ok - it hits
> virtio_blk_handle_output() which starts the iothread which goes on to
> process the request.
>
> On the second request, however, we get back into
> virtio_blk_data_plane_start() which sees the iothread is already
> running and aborts. I think it is assuming that this must be the
> result of a race with another vcpu starting the dataplane, and so
> assumes the racing thread will have woken the dataplane which will
> then handle this vcpu's request as well.
>
> In our case, however, the IO hcalls go through to
> virtio_blk_handle_output() when the dataplane already going, and
> become no-ops without waking it up again to handle the new request.
>
> Enlightened enough yet?
So reading this, it sounds like we could just add logic in the virtio
dataplane code that allows for a graceful fallback to QEMU based MMIO by
triggering the eventfd itself in the MMIO handler. When going via this
slow path, we should of course emit a warning (once) to the user ;).
Stefan, what do you think?
Alex
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, (continued)
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Paul Mackerras, 2015/02/03
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/03
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Alexander Graf, 2015/02/04
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/04
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Alexander Graf, 2015/02/04
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/04
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Alexander Graf, 2015/02/05
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/05
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Alexander Graf, 2015/02/05
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/05
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations,
Alexander Graf <=
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, David Gibson, 2015/02/09
- Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations, Alexander Graf, 2015/02/08