Re: [Qemu-devel] libiscsi task cancellation
From: Felipe Franciosi
Subject: Re: [Qemu-devel] libiscsi task cancellation
Date: Thu, 8 Feb 2018 15:33:54 +0000

> On 8 Feb 2018, at 14:58, Hannes Reinecke <address@hidden> wrote:
>
> On 02/08/2018 03:14 PM, Paolo Bonzini wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>>> Now on to libiscsi:
>>>
>>> The iscsi_task_mgmt_async() API documentation says:
>>>
>>>  * abort_task will also cancel the scsi task. The callback for the
>>>  * scsi task will be invoked with SCSI_STATUS_CANCELLED
>>>
>>> I see that the ABORT TASK TMF response invokes the user's
>>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>>> not sure how the command callback is invoked with
>>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>>> that response.
>>>
>>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>>
>> No, and QEMU is assuming the "wrong" behavior:
>>
>> static void
>> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status,
>>                     void *command_data, void *private_data)
>> {
>>     IscsiAIOCB *acb = private_data;
>>
>>     acb->status = -ECANCELED;
>>     iscsi_schedule_bh(acb);
>> }
>>
> The definition of ABORT TASK TMF in SAM is pretty much useless.
> To quote:
>
> A response of FUNCTION COMPLETE shall indicate that the task was aborted
> or was not in the task set.
>
> I.e., we have no idea if we ever managed to abort the task; if the task
> had been in flight by the time we sent the TMF, we'll be getting a
> FUNCTION COMPLETE, too.

Why is that a problem? I am under the impression that drivers can cope with
that. You can complete the TMF after the original request has completed
(successfully or not).

> So most FC HBA firmware implements the abort task with just a blacklist;
> the TMF will be returned immediately and the command response will be
> dropped if and when it arrives.

That sounds very dangerous, but maybe it's safe for FC (I don't know the
details). What happens if you complete the TMF immediately and the OS believes
the IO (e.g. a write) has been aborted, then issues another write for the same
LBA, and that write completes successfully before the original request (which
is still outstanding)?
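
To make that ordering concrete, here is a toy replay of the scenario in C. It is only a sketch: `struct disk`, `target_apply_write()` and `replay_early_tmf_completion()` are hypothetical names invented for illustration, and nothing here models real FC firmware, libiscsi, or QEMU behaviour.

```c
#include <assert.h>

/* Hypothetical single-block "disk", for illustration only. */
struct disk {
    int lba0;
};

/* The target applies a write whenever it gets around to it. */
static void target_apply_write(struct disk *d, int value)
{
    assert(d != 0);
    d->lba0 = value;
}

/* Replays the ordering from the question above and returns what ends up
 * stored in the block. */
static int replay_early_tmf_completion(void)
{
    struct disk d = { 0 };
    const int old_write = 1;  /* the write the OS asked to abort */
    const int new_write = 2;  /* the write issued after the TMF "completed" */

    /* 1. old_write is still in flight; the TMF is answered immediately. */
    /* 2. The OS believes old_write is gone, issues new_write, and that
     *    write completes first. */
    target_apply_write(&d, new_write);

    /* 3. old_write was never really aborted and lands afterwards. Dropping
     *    its response does not undo its effect on the medium. */
    target_apply_write(&d, old_write);

    return d.lba0;
}
```

In this replay the block ends up holding the data of the supposedly aborted write, while the OS expects the data of the newer one.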

In our implementation, we sit on the TMF and wait for the original request to
either complete or abort. Only then do we return the TMF response to the
issuing OS.
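
A minimal sketch of that "sit on the TMF" approach could look like the following. All types and functions (`struct request`, `struct tmf`, `tmf_submit()`, `request_finish()`, `tmf_respond()`) are hypothetical and exist only for this illustration; they are not the actual implementation, nor libiscsi or QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, for illustration only. */
enum req_state { REQ_INFLIGHT, REQ_COMPLETED, REQ_ABORTED };

struct tmf;

struct request {
    enum req_state state;
    struct tmf *pending_tmf;  /* ABORT TASK waiting on this request, if any */
};

struct tmf {
    struct request *target;
    bool responded;           /* true once the response went to the issuing OS */
};

/* Stand-in for sending the TMF response back to the issuing OS. */
static void tmf_respond(struct tmf *t)
{
    t->responded = true;
}

/* The OS issued ABORT TASK: answer right away only if the request has
 * already finished; otherwise park the TMF on the request. */
static void tmf_submit(struct tmf *t, struct request *r)
{
    assert(t != NULL && r != NULL);
    t->target = r;
    t->responded = false;
    if (r->state == REQ_INFLIGHT) {
        r->pending_tmf = t;   /* sit on the TMF */
    } else {
        tmf_respond(t);
    }
}

/* The backend finished the request, either normally or because the abort
 * took effect. Only now is a parked TMF answered. */
static void request_finish(struct request *r, enum req_state final_state)
{
    assert(r != NULL && final_state != REQ_INFLIGHT);
    r->state = final_state;
    if (r->pending_tmf != NULL) {
        tmf_respond(r->pending_tmf);
        r->pending_tmf = NULL;
    }
}
```

With this shape, a caller would invoke `tmf_submit()` when the abort arrives and `request_finish()` from the request's completion path, so the OS never sees the TMF response while the original request can still touch the medium.
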
Cheers,
Felipe