|
From: | Richard Henderson |
Subject: | Re: [PATCH v2 8/9] async: update documentation of the memory barriers |
Date: | Wed, 8 Mar 2023 08:47:12 -0800 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 |
On 3/8/23 02:49, Paolo Bonzini wrote:
On 3/7/23 18:26, Richard Henderson wrote:On 3/7/23 09:00, Paolo Bonzini wrote:while QSLIST_REMOVE_HEAD in the dequeuing thread is not ordered at all: y.store(0, mo_relaxed); // QSLIST_REMOVE_HEAD x.store(0, mo_release); // fetch_andAs I read aio_bh_queue, this is exactly the situation you describe in patch 1 justifying the introduction of the new barriers.Only store-store reordering is required between QSLIST_REMOVE_HEAD and atomic_fetch_and(); and that one *is* blocked by atomic_fetch_and(), since mo_seq_cst is a superset of both mo_release. The new barriers are needed for store-load reordering.In patch 1 you say: # in C11, sequentially consistent atomics (except for seq-cst fences) # only affect the ordering of sequentially consistent operations. and the store in QSLIST_REMOVE_HEAD is not a sequentially consistent operation. Therefore by your own logic we must have a separate barrier here.You're right that the comment is contradictory. It's the comment that is wrong. The right text should be ---in C11, with the exception of seq-cst fences, the order established by sequentially consistent atomics does not propagate to other memory accesses on either side of the seq-cst atomic. As far as those are concerned, loads performed by a seq-cst atomic are just acquire loads, and stores are just release stores. Even though loads that occur after a RMW operation cannot move above the load, they can still sneak above the store!---
Ok, thanks.
I wonder if your definition/description of smp_mb__before_rmw() isn't actively misleading in this case.- We could drop it entirely and be less confusing, by not having to explain it. - We could define it as signal_barrier() for all hosts, simply to fix the compiler-theoretic reordering problem.The case that I was imagining for smp_mb__before_rmw() is something like this: wake_me = true; smp_mb__before_rmw(); if (qatomic_xchg(&can_sleep, true)) { ... } where you really need a full barrier.
What is different about this that doesn't apply to the remove-head case we've been talking about?
r~
[Prev in Thread] | Current Thread | [Next in Thread] |