[Qemu-devel] [PULL v3 25/38] coroutine-ucontext: use __thread
From: Stefan Hajnoczi
Subject: [Qemu-devel] [PULL v3 25/38] coroutine-ucontext: use __thread
Date: Tue, 13 Jan 2015 13:48:03 +0000
From: Paolo Bonzini <address@hidden>
ELF thread local storage is about 10% faster on tests/test-coroutine's
perf/cost test. The timing on my machine is 190ns per iteration with
pthread TLS, 170ns with ELF TLS.
Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
the model of coroutine-win32.c (including the important "noinline"
attribute!).
Platforms without thread-local storage (OpenBSD probably?) will need
a new-enough GCC for this to compile, in order to use the same emutls
support that Windows already relies on.
Signed-off-by: Paolo Bonzini <address@hidden>
Reviewed-by: Fam Zheng <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>
---
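Note (not part of the commit): the two TLS styles compared in the commit
message can be sketched side by side as below. This is a minimal
illustration, not the QEMU code. All names (ThreadState, state_key,
get_state_pthread, get_state_elf, run_worker) are hypothetical, and
pthread_once() stands in for the constructor the old code used. The
pthread variant pays for a key lookup and a lazy allocation check on each
access, while the __thread variant is resolved by the compiler and linker.

```c
/* Sketch only: every name here is hypothetical, none is taken from QEMU. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int current;   /* stand-in for the per-thread "current" pointer */
} ThreadState;

/* --- pthread TLS: key lookup on every access, lazily allocated --- */
static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;

static void state_key_init(void)
{
    /* free() is registered as the destructor run at thread exit */
    if (pthread_key_create(&state_key, free) != 0) {
        abort();
    }
}

static ThreadState *get_state_pthread(void)
{
    ThreadState *s;

    pthread_once(&state_key_once, state_key_init);
    s = pthread_getspecific(state_key);
    if (!s) {
        s = calloc(1, sizeof(*s));
        if (!s) {
            abort();
        }
        pthread_setspecific(state_key, s);
    }
    return s;
}

/* --- ELF TLS: zero-initialized per thread, no allocation or cleanup --- */
static __thread ThreadState elf_state;

static ThreadState *get_state_elf(void)
{
    return &elf_state;
}

/* Thread body: bump both counters and report their sum through arg. */
static void *worker(void *arg)
{
    get_state_pthread()->current += 1;
    get_state_elf()->current += 1;
    *(int *)arg = get_state_pthread()->current + get_state_elf()->current;
    return NULL;
}

/* Runs worker() in a fresh thread and returns the sum it observed. */
static int run_worker(void)
{
    pthread_t t;
    int seen = 0;

    if (pthread_create(&t, NULL, worker, &seen) != 0) {
        abort();
    }
    pthread_join(t, NULL);
    return seen;
}
```

If the caller first sets both of its own counters to some other value,
run_worker() should still return 2, since a new thread starts from
zeroed state in both variants. With __thread there is also nothing to
free at thread exit, which is why the key destructor and the
constructor can be dropped in the diff below.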
coroutine-ucontext.c | 69 +++++++++++++++-------------------------------------
1 file changed, 19 insertions(+), 50 deletions(-)
diff --git a/coroutine-ucontext.c b/coroutine-ucontext.c
index 4bf2cde..259fcb4 100644
--- a/coroutine-ucontext.c
+++ b/coroutine-ucontext.c
@@ -25,7 +25,6 @@
#include <stdlib.h>
#include <setjmp.h>
#include <stdint.h>
-#include <pthread.h>
#include <ucontext.h>
#include "qemu-common.h"
#include "block/coroutine_int.h"
@@ -48,15 +47,8 @@ typedef struct {
/**
* Per-thread coroutine bookkeeping
*/
-typedef struct {
- /** Currently executing coroutine */
- Coroutine *current;
-
- /** The default coroutine */
- CoroutineUContext leader;
-} CoroutineThreadState;
-
-static pthread_key_t thread_state_key;
+static __thread CoroutineUContext leader;
+static __thread Coroutine *current;
/*
* va_args to makecontext() must be type 'int', so passing
@@ -68,36 +60,6 @@ union cc_arg {
int i[2];
};
-static CoroutineThreadState *coroutine_get_thread_state(void)
-{
- CoroutineThreadState *s = pthread_getspecific(thread_state_key);
-
- if (!s) {
- s = g_malloc0(sizeof(*s));
- s->current = &s->leader.base;
- pthread_setspecific(thread_state_key, s);
- }
- return s;
-}
-
-static void qemu_coroutine_thread_cleanup(void *opaque)
-{
- CoroutineThreadState *s = opaque;
-
- g_free(s);
-}
-
-static void __attribute__((constructor)) coroutine_init(void)
-{
- int ret;
-
- ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
- if (ret != 0) {
- fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
- abort();
- }
-}
-
static void coroutine_trampoline(int i0, int i1)
{
union cc_arg arg;
@@ -193,15 +155,23 @@ void qemu_coroutine_delete(Coroutine *co_)
g_free(co);
}
-CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
- CoroutineAction action)
+/* This function is marked noinline to prevent GCC from inlining it
+ * into coroutine_trampoline(). If we allow it to do that then it
+ * hoists the code to get the address of the TLS variable "current"
+ * out of the while() loop. This is an invalid transformation because
+ * the sigsetjmp() call may be called when running thread A but
+ * return in thread B, and so we might be in a different thread
+ * context each time round the loop.
+ */
+CoroutineAction __attribute__((noinline))
+qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
+ CoroutineAction action)
{
CoroutineUContext *from = DO_UPCAST(CoroutineUContext, base, from_);
CoroutineUContext *to = DO_UPCAST(CoroutineUContext, base, to_);
- CoroutineThreadState *s = coroutine_get_thread_state();
int ret;
- s->current = to_;
+ current = to_;
ret = sigsetjmp(from->env, 0);
if (ret == 0) {
@@ -212,14 +182,13 @@ CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_,
Coroutine *qemu_coroutine_self(void)
{
- CoroutineThreadState *s = coroutine_get_thread_state();
-
- return s->current;
+ if (!current) {
+ current = &leader.base;
+ }
+ return current;
}
bool qemu_in_coroutine(void)
{
- CoroutineThreadState *s = pthread_getspecific(thread_state_key);
-
- return s && s->current->caller;
+ return current && current->caller;
}
--
2.1.0
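
Note (not part of the patch): the lazy "leader" fallback in
qemu_coroutine_self() and the NULL check in qemu_in_coroutine() can be
tried in isolation with the sketch below. It is a simplified model under
stated assumptions: Co, co_self, co_in_coroutine, co_enter and co_leave
are hypothetical names, no stack switching is done, and the caller field
is assumed to be non-NULL only while a coroutine has been entered.

```c
/* Sketch only: a stripped-down model, not the QEMU implementation. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Coroutine. */
typedef struct Co {
    struct Co *caller;   /* assumed non-NULL only while entered */
} Co;

static __thread Co leader;     /* zero-initialized in every thread */
static __thread Co *current;   /* NULL until first use in a thread */

/* Mirrors the patched qemu_coroutine_self(): adopt the leader lazily. */
static Co *co_self(void)
{
    if (!current) {
        current = &leader;
    }
    return current;
}

/* Mirrors the patched qemu_in_coroutine(): tolerate a NULL current. */
static bool co_in_coroutine(void)
{
    return current && current->caller;
}

/* Bookkeeping only; a real switch would also swap stacks. */
static void co_enter(Co *co)
{
    co->caller = co_self();
    current = co;
}

static void co_leave(void)
{
    Co *co = co_self();

    current = co->caller;
    co->caller = NULL;
}
```

Because a __thread pointer cannot be statically initialized to the
address of another __thread object in C, the first co_self() call in
each thread does the assignment instead. co_in_coroutine() must therefore
handle current == NULL, which is the reason for the extra check in the
last hunk.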