Why is this device even messing around with multiple
threads and atomics anyway?
Because it is an example of deferring device work to another thread,
just like on real hardware it may be deferred to an on-device
microcontroller or CPU.
If we want to be able to do that, we should probably have
infrastructure and higher-level primitives for it that
don't require device authors to be super-familiar with
QEMU's memory model and barriers... The fact that there are only
half a dozen other uses of qemu_thread_create() under hw/
suggests that in practice we don't really need to do this
very often, though.