From: Eric Blake
Subject: Re: [Qemu-block] [Qemu-devel] [PULL 21/35] block: fix QEMU crash with scsi-hd and drive_del
Date: Wed, 8 Aug 2018 09:53:43 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 08/08/2018 09:32 AM, Vladimir Sementsov-Ogievskiy wrote:

What's more, in commit f140e300, we specifically called out in the commit message that maybe it was better to trace when we detect connection closed rather than log it to stdout, and in all cases in that commit, the additional 'Connection closed' messages do not add any information to the error message already displayed by the rest of the code.


Ok, agree, I'll do it in reconnect series.



hmm, do what?

I was going to change these error messages to be traces, but now I'm not sure that it's a good idea.

Traces are fine. They won't show up in iotests, but will show up when debugging a failed connection.
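For illustration only (this is not the actual patch; the event name and file placement are made up), converting such a message into a QEMU trace point would look roughly like this:

  # nbd/trace-events: declare the event (hypothetical name)
  nbd_connection_closed(const char *err) "connection closed: %s"

  /* In the client code, include the generated header and call the trace
   * function instead of printing the message: */
  #include "trace.h"
  ...
      trace_nbd_connection_closed(local_err ? error_get_pretty(local_err) : "");

An event like that stays silent in iotest output but can still be turned on when debugging (e.g. via the tools' --trace option).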

We have a generic errp returned from the function, so why drop it from the logs?

Because it is redundant with the very next line already in the log. Any error encountered when trying to write to a disconnected server is redundant with an already-reported error due to detecting EOF on reading from the server.

Fixing the iotest is not a good reason; it would be better to adjust the iotest itself a bit (just commit the changed output) and forget about it. Is the iotest itself racy? Did you see different output when running iotest 83, rather than only when testing by hand?

The condition for the output of the 'Connection closed' message is racy - it depends entirely on whether the client was able to send() a read request to the server before the server disconnected immediately after negotiation ended. If the client loses the race and detects the server hangup before writing anything, you get one path; if the client wins the race, successfully writes the request, and only later learns that the server has disconnected when trying to read the response to that request, you get the other path. The window for the race changed (and the iotests did not seem to ever expose it short of this particular change to the block layer to do an extra drain), but I could still imagine scenarios where the iotests trigger the opposite path of the race from what is expected, depending on load. I don't see any synchronization point between the two processes: the server hangs up after negotiation without reading the client's request, but the client may or may not have had time to get its request into the server's queue.
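As a standalone illustration of that race (plain POSIX sockets, not QEMU or NBD code; all names here are made up), the two observable paths look roughly like this:

  #include <errno.h>
  #include <signal.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void)
  {
      int sv[2];
      pid_t pid;
      char req[] = "read request";
      char reply[64];

      signal(SIGPIPE, SIG_IGN);         /* get EPIPE instead of a fatal signal */
      if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
          perror("socketpair");
          return 1;
      }

      pid = fork();
      if (pid == 0) {                   /* "server": hang up right after setup */
          close(sv[0]);
          usleep(getpid() % 2000);      /* jitter: the close may land before or
                                           after the client's write() */
          close(sv[1]);                 /* disconnect without reading anything */
          _exit(0);
      }

      close(sv[1]);                     /* "client" keeps the other end */
      if (write(sv[0], req, sizeof(req)) < 0) {
          /* client lost the race: disconnect detected before the request
             was even sent */
          fprintf(stderr, "send failed: %s\n", strerror(errno));
      } else if (read(sv[0], reply, sizeof(reply)) <= 0) {
          /* client won the race: request was sent, EOF only shows up while
             waiting for the reply */
          fprintf(stderr, "no reply: connection closed\n");
      }
      close(sv[0]);
      waitpid(pid, NULL, 0);
      return 0;
  }

Either way the client ends up reporting a failed read; the only difference is whether the failure is noticed at send() time or at reply time, which is why the extra 'Connection closed' line is timing-dependent.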

So even though I have not seen the iotest fail directly because of a race, I think that this commit causing failures in the iotest is evidence that the test is not robust with those extra 'Connection closed' messages being output. Switching the output to be a trace instead should be just fine; overall, the client's attempt to read when the server hangs up will be an EIO failure whether the client managed to send() its request and merely fails to get a reply (server disconnect was slow), or was not even able to send() its request (server disconnect was fast).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


