Re: [PATCH experiment 00/35] stackless coroutine backend
From: Daniel P. Berrangé
Subject: Re: [PATCH experiment 00/35] stackless coroutine backend
Date: Fri, 11 Mar 2022 12:17:06 +0000
User-agent: Mutt/2.1.5 (2021-12-30)

On Fri, Mar 11, 2022 at 01:04:33PM +0100, Paolo Bonzini wrote:
> On 3/11/22 10:27, Stefan Hajnoczi wrote:
> > > Not quite voluntarily, but I noticed I had to add one 0 to make them run
> > > for a decent amount of time. So yeah, it's much faster than siglongjmp.
> > That's a nice first indication that performance will be good. I guess
> > that deep coroutine_fn stacks could be less efficient with stackless
> > coroutines compared to ucontext, but the cost of switching between
> > coroutines (enter/yield) will be lower with stackless coroutines.
>
> Note that right now I'm not placing the coroutine_fn stack on the heap, it's
> still allocated from a contiguous area in virtual address space. The
> contiguous allocation is wrapped by coroutine_stack_alloc and
> coroutine_stack_free, so it's really easy to change them to malloc and free.
>
> I also do not have to walk up the whole call stack on coroutine_fn yields,
> because calls from one coroutine_fn to the next are tail calls; in exchange
> for that, I have more indirect calls than if the code did
>
>     if (next_call() == COROUTINE_YIELD) {
>         return COROUTINE_YIELD;
>     }
>
> For now the choice was again just the one that made the translation easiest.
>
> Today I also managed to implement a QEMU-like API on top of C++ coroutines:
>
> CoroutineFn<int> return_int() {
>     co_await qemu_coroutine_yield();
>     co_return 30;
> }
>
> CoroutineFn<void> return_void() {
>     co_await qemu_coroutine_yield();
> }
>
> CoroutineFn<void> co(void *) {
>     co_await return_void();
>     printf("%d\n", co_await return_int());
>     co_await qemu_coroutine_yield();
> }
>
> int main() {
>     Coroutine *f = qemu_coroutine_create(co, NULL);
>     printf("--- 0\n");
>     qemu_coroutine_enter(f);
>     printf("--- 1\n");
>     qemu_coroutine_enter(f);
>     printf("--- 2\n");
>     qemu_coroutine_enter(f);
>     printf("--- 3\n");
>     qemu_coroutine_enter(f);
>     printf("--- 4\n");
> }
>
> The runtime code is absurdly obscure; my favorite bit is
>
> Yield qemu_coroutine_yield()
> {
>     return Yield();
> }
>
> :) However, at 200 lines of code it's certainly smaller than a
> source-to-source translator. It might be worth investigating a bit more.
> Only files that define or use a coroutine_fn (which includes callers of
> qemu_coroutine_create) would have to be compiled as C++.

Unless I'm misunderstanding what you mean, "define a coroutine_fn"
covers a very large number of functions/files:

  $ git grep coroutine_fn | wc -l
  806
  $ git grep -l coroutine_fn | wc -l
  132

Dominated by the block layer of course, but with tentacles spreading
out into a lot of other code.

Feels like identifying all callers would be tedious/unpleasant enough
that, practically speaking, we would have to just compile all of QEMU
as C++.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|