Re: [Qemu-devel] [PATCH v3 2/5] util: introduce threaded workqueue


From: Xiao Guangrong
Subject: Re: [Qemu-devel] [PATCH v3 2/5] util: introduce threaded workqueue
Date: Tue, 27 Nov 2018 16:29:05 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1



On 11/27/18 2:49 AM, Emilio G. Cota wrote:
On Mon, Nov 26, 2018 at 16:06:37 +0800, Xiao Guangrong wrote:
+    /* after the user fills the request, the bit is flipped. */
+    uint64_t request_fill_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
+    /* after handling the request, the thread flips the bit. */
+    uint64_t request_done_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);

Use DECLARE_BITMAP, otherwise you'll get type errors as David
pointed out.

If we do that, the field becomes a pointer... which complicates
things.

Not necessarily, see below.

On Mon, Nov 26, 2018 at 16:18:24 +0800, Xiao Guangrong wrote:
On 11/24/18 8:17 AM, Emilio G. Cota wrote:
On Thu, Nov 22, 2018 at 15:20:25 +0800, address@hidden wrote:
+static uint64_t get_free_request_bitmap(Threads *threads, ThreadLocal *thread)
+{
+    uint64_t request_fill_bitmap, request_done_bitmap, result_bitmap;
+
+    request_fill_bitmap = atomic_rcu_read(&thread->request_fill_bitmap);
+    request_done_bitmap = atomic_rcu_read(&thread->request_done_bitmap);
+    bitmap_xor(&result_bitmap, &request_fill_bitmap, &request_done_bitmap,
+               threads->thread_requests_nr);

This is not wrong, but it's a bit ugly. Instead, I would:

- Introduce bitmap_xor_atomic in a previous patch
- Use bitmap_xor_atomic here, getting rid of the rcu reads

Hmm, however, we do not need an atomic xor operation here... that should be
slower than just two READ_ONCE calls.

If you use DECLARE_BITMAP, you get an in-place array. On a 64-bit
host, that'd be
        unsigned long foo[1]; /* [2] on 32-bit */

Then again on 64-bit hosts, bitmap_xor_atomic would reduce
to 2 atomic reads:

static inline void bitmap_xor_atomic(unsigned long *dst,
const unsigned long *src1, const unsigned long *src2, long nbits)
{
     if (small_nbits(nbits)) {
         *dst = atomic_read(src1) ^ atomic_read(src2);
     } else {
         slow_bitmap_xor_atomic(dst, src1, src2, nbits);

We needn't do an in-place xor operation, i.e., we just fetch the bitmaps into
local variables and do the xor locally.

So we need additional complexity to handle the !small_nbits(nbits) case
... but it is really not a big deal, as you said; it is just a couple of lines
of code.

However, using u64, given that only 64 indexes are allowed, is more
straightforward and can be naturally understood. :)
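
For illustration, the "fetch into locals, xor locally" idea with plain
64-bit fields could be sketched as below. This is only a sketch under
assumptions: ThreadLocalSketch and pending_request_bitmap() are hypothetical
names, C11 relaxed loads stand in for READ_ONCE/atomic_read, and the memory
barriers that the real patch would pair with these reads are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread state discussed in the patch. */
typedef struct {
    _Atomic uint64_t request_fill_bitmap; /* bit flipped when a request is filled */
    _Atomic uint64_t request_done_bitmap; /* bit flipped when a request is handled */
} ThreadLocalSketch;

/*
 * Read both bitmaps once into locals and xor them locally.
 * A set bit in the result marks an index whose fill and done bits differ.
 * nr is the number of valid request indexes (at most 64 for a uint64_t).
 */
static uint64_t pending_request_bitmap(ThreadLocalSketch *t, unsigned nr)
{
    assert(nr >= 1 && nr <= 64);

    /* Two independent single-word reads; no atomic read-modify-write needed. */
    uint64_t fill = atomic_load_explicit(&t->request_fill_bitmap,
                                         memory_order_relaxed);
    uint64_t done = atomic_load_explicit(&t->request_done_bitmap,
                                         memory_order_relaxed);

    /* Mask off bits beyond nr; shifting a 64-bit value by 64 is undefined. */
    uint64_t mask = (nr == 64) ? UINT64_MAX : ((UINT64_C(1) << nr) - 1);

    return (fill ^ done) & mask;
}
```

With a single uint64_t there is no multi-word case to handle, which is the
simplification argued for above; the cost is the hard limit of 64 indexes.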



