> I don't think this works well enough: if the source isn't removed, but
> fd_can_read() returns 0, there is a potential busy loop on the next
This shouldn't happen. All the sources are set to non-blocking mode; the only
blocking code is the poll itself. If fd_can_read() returns 0, the next call to
fd_read() will attempt to read zero bytes. The backend logic checks the result
of the same method that fd_can_read() calls and sets its read size to that
amount, so in the case of a full buffer it reads 0 bytes and returns.
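To make the clamping concrete, here is a minimal standalone sketch. It is not
the actual chardev code: DemoFrontend, demo_can_read(), demo_fd_read() and
demo_run() are hypothetical names, and a pipe stands in for the real source.
It only shows that a read sized by the frontend's free space returns 0 bytes
when the buffer is full and leaves the data queued in the fd.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the frontend: tracks how full its buffer is. */
typedef struct {
    size_t capacity;
    size_t used;
} DemoFrontend;

/* Free space the frontend can still accept (assumed analogue of can_read). */
static size_t demo_can_read(const DemoFrontend *fe)
{
    return fe->capacity - fe->used;
}

/* Hypothetical backend read: the read size is clamped to the free space. */
static ssize_t demo_fd_read(int fd, DemoFrontend *fe)
{
    char buf[64];
    size_t len = demo_can_read(fe);

    if (len > sizeof(buf)) {
        len = sizeof(buf);
    }
    if (len == 0) {
        return 0; /* full buffer: nothing is read, the data stays in the fd */
    }
    ssize_t n = read(fd, buf, len);
    if (n > 0) {
        fe->used += (size_t)n;
    }
    return n;
}

/* Reads once with a full buffer, then again after the "guest" drains 2 bytes. */
void demo_run(ssize_t *when_full, ssize_t *after_drain)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);
    rc = fcntl(p[0], F_SETFL, O_NONBLOCK); /* sources are non-blocking */
    assert(rc == 0);
    ssize_t w = write(p[1], "hello", 5);
    assert(w == 5);

    DemoFrontend fe = { .capacity = 4, .used = 4 }; /* buffer starts full */
    *when_full = demo_fd_read(p[0], &fe);
    fe.used -= 2;                                   /* guest clears 2 bytes */
    *after_drain = demo_fd_read(p[0], &fe);

    close(p[0]);
    close(p[1]);
}
```

In this sketch the first call returns 0 without blocking, and the second call
reads exactly the two bytes that fit once space has been freed.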
> My understanding is that if data is read from the frontend, the loop
> will be re-entered and io_watch_poll_prepare will set the callback
This just doesn't happen. The issue is that between the poll being added (with
some but not all of the data read) and the frontend code being triggered by
the guest, the IO loop runs again and the poll is removed. The loop then runs
once more with the poll removed (since the poll is removed during setup), and
this time it simply blocks, because the input fd in question has been
"temporarily removed". Nothing left in the fd set it polls on is connected to
the guest clearing the buffer.
Meanwhile the guest reads the data during what can potentially be an infinite
block, if nothing else sets a timeout. (In my case something in the uart
peripheral sets a 1000ms timeout, so I could read a byte every second or so in
the guest.) The guest then spins until the poll is re-added, while the poll
blocks until a timeout expires or another fd becomes ready. Because the
buffers are small, the fd in question has already been removed from the set by
the time the guest has a chance to clear the buffer.
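The blocking behaviour itself is easy to show in isolation. The sketch below
uses plain poll(2) on a pipe rather than the real main loop code, and
demo_poll() and demo_removed_fd() are hypothetical names. It only illustrates
the general point: once an fd has been left out of the set, pending data on
it cannot wake the poll, so the call sleeps for the whole timeout.

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: poll with the input fd either in the set or removed. */
static int demo_poll(int fd, int watch_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    /* With nfds == 0 the fd is not watched at all, only the timeout applies. */
    return poll(&pfd, watch_fd ? 1 : 0, timeout_ms);
}

/* Polls a pipe that already has data pending, with and without the fd. */
void demo_removed_fd(int *with_fd, int *without_fd)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);
    ssize_t w = write(p[1], "x", 1); /* data is already waiting to be read */
    assert(w == 1);

    *with_fd = demo_poll(p[0], 1, 1000);  /* fd in the set: ready at once */
    *without_fd = demo_poll(p[0], 0, 50); /* fd removed: waits out the timeout */

    close(p[0]);
    close(p[1]);
}
```

Here the 50ms timeout plays the role of the 1000ms one in my setup: the data
is sitting in the pipe the whole time, but the second poll only returns when
the timeout expires.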
> Could you provide a simple use-case or reproducer where we can
> evaluate how your patch improves the situation?
I can do this, but I don't have anything ready right now; my test case isn't
ideal for others to reproduce. I can attach one later today once it's done.