qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH experiment 00/35] stackless coroutine backend


From: Daniel P . Berrangé
Subject: Re: [PATCH experiment 00/35] stackless coroutine backend
Date: Fri, 11 Mar 2022 12:17:06 +0000
User-agent: Mutt/2.1.5 (2021-12-30)

On Fri, Mar 11, 2022 at 01:04:33PM +0100, Paolo Bonzini wrote:
> On 3/11/22 10:27, Stefan Hajnoczi wrote:
> > > Not quite voluntarily, but I noticed I had to add one 0 to make them run 
> > > for
> > > a decent amount of time.  So yeah, it's much faster than siglongjmp.
> > That's a nice first indication that performance will be good. I guess
> > that deep coroutine_fn stacks could be less efficient with stackless
> > coroutines compared to ucontext, but the cost of switching between
> > coroutines (enter/yield) will be lower with stackless coroutines.
> 
> Note that right now I'm not placing the coroutine_fn stack on the heap, it's
> still allocated from a contiguous area in virtual address space. The
> contiguous allocation is wrapped by coroutine_stack_alloc and
> coroutine_stack_free, so it's really easy to change them to malloc and free.
> 
> I also do not have to walk up the whole call stack on coroutine_fn yields,
> because calls from one coroutine_fn to the next are tail calls; in exchange
> for that, I have more indirect calls than if the code did
> 
>     if (next_call() == COROUTINE_YIELD) {
>         return COROUTINE_YIELD;
>     }
> 
> For now the choice was again just the one that made the translation easiest.
> 
> Today I also managed to implement a QEMU-like API on top of C++ coroutines:
> 
>     CoroutineFn<int> return_int() {
>         co_await qemu_coroutine_yield();
>         co_return 30;
>     }
> 
>     CoroutineFn<void> return_void() {
>         co_await qemu_coroutine_yield();
>     }
> 
>     CoroutineFn<void> co(void *) {
>         co_await return_void();
>         printf("%d\n", co_await return_int())
>         co_await qemu_coroutine_yield();
>     }
> 
>     int main() {
>         Coroutine *f = qemu_coroutine_create(co, NULL);
>         printf("--- 0\n");
>         qemu_coroutine_enter(f);
>         printf("--- 1\n");
>         qemu_coroutine_enter(f);
>         printf("--- 2\n");
>         qemu_coroutine_enter(f);
>         printf("--- 3\n");
>         qemu_coroutine_enter(f);
>         printf("--- 4\n");
>     }
> 
> The runtime code is absurdly obscure; my favorite bit is
> 
>     Yield qemu_coroutine_yield()
>     {
>         return Yield();
>     }
> 
> :) However, at 200 lines of code it's certainly smaller than a
> source-to-source translator.  It might be worth investigating a bit more.
> Only files that define or use a coroutine_fn (which includes callers of
> qemu_coroutine_create) would have to be compiled as C++.

Unless I'm misunderstanding what you mean, "define a coroutine_fn"
is a very large number of functions/files

  $ git grep coroutine_fn | wc -l
  806
  $ git grep -l coroutine_fn | wc -l
  132

Dominated by the block layer of course, but tentacles spreading
out into alot of other code.

Feels like identifying all callers would be tedious/unpleasant enough,
that practically speaking we would have to just compile all of QEMU
as C++.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




reply via email to

[Prev in Thread] Current Thread [Next in Thread]