[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition
From: |
Peter Lieven |
Subject: |
Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition |
Date: |
Thu, 29 May 2014 20:41:05 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 |
Am 29.05.2014 19:47, schrieb Paolo Bonzini:
> Il 29/05/2014 19:31, Peter Lieven ha scritto:
>> +static const unsigned iscsi_retry_times[] = {4, 16, 64, 256, 1024, 4096};
>
> This is a pretty fast growth.
I know, I wanted to end up with a total retry period of approx. 10 seconds
before failing.
As we allow only for 5 retries smaller growth would result in less total time.
>
>> +static void iscsi_retry_timer_expired(void *opaque)
>> +{
>> + struct IscsiTask *iTask = opaque;
>> + timer_del(iTask->retry_timer);
>
> This timer_del is not necessary. Also, since you are introducing a new usage
> I would prefer if you used aio_timer_init instead.
Okay, I didn't know that aio_timer_new was deprecated.
timer_del is not necessary because we delete the timer in the callback and can
be sure that it is not armed?
>
>> + timer_free(iTask->retry_timer);
>> + if (iTask->co) {
>> + qemu_coroutine_enter(iTask->co, NULL);
>> + }
>> +}
>> +
>> +#define RANDRANGE(M, N) (M + rand() / (RAND_MAX / (N - M + 1) + 1))
>
> A minor nit: you should probably use "((double)rand() / RAND_MAX) and then
> cast it back to integer at the end. You should also make it a static inline
> function for readability, or surround the arguments with parentheses.
>
> If you care, it probably would be better to have the random numbers be
> exponentially distributed, even more so since the ranges form a geometric
> progression. So for example the first retry would be on average 8 ms rather
> than 10 ms. The difference for the last retry is half a second!
I hope we never get in the situation that we need more than one retry. I just
wanted to give the storage a chance to calm down and in this case a static
value would be either
to high for the first retry or to low for the higher retries if ever reached.
If you want I can include this one here which seems to work:
static inline unsigned geo_rand_range(double m, double n) {
return exp((log(m) + (double)rand() /
(RAND_MAX / (log(n) - log(m) + log(1)) + log(1))));
}
>
> You can do that by taking the (natural) logarithm of N and M, applying
> randrange on the logarithms, and then applying exp on the result.
>
> Otherwise looks good, but how did you test the patch? I'd rather have the
> growth in iscsi_retry_times backed by actual data, but I understand that's
> hard to gather.
I don't know how actually. In my use case a see a single BUSY once every few
hours on a group runnning new storage firmware with a few hundred targets.
For testing this I randomly created BUSY conditions and overwrote the status.
Peter
>
> Paolo
>
>> static void
>> iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
>> void *command_data, void *opaque)
>> @@ -156,20 +172,40 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int
>> status,
>> iTask->do_retry = 0;
>> iTask->task = task;
>>
>> - if (iTask->retries-- > 0 && status == SCSI_STATUS_CHECK_CONDITION
>> - && task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
>> - error_report("iSCSI CheckCondition: %s", iscsi_get_error(iscsi));
>> - iTask->do_retry = 1;
>> - goto out;
>> - }
>> -
>> if (status != SCSI_STATUS_GOOD) {
>> + if (iTask->retries++ < ISCSI_CMD_RETRIES) {
>> + if (status == SCSI_STATUS_CHECK_CONDITION
>> + && task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
>> + error_report("iSCSI CheckCondition: %s",
>> + iscsi_get_error(iscsi));
>> + iTask->do_retry = 1;
>> + goto out;
>> + }
>> + if (status == SCSI_STATUS_BUSY) {
>> + unsigned retry_time =
>> + RANDRANGE(iscsi_retry_times[iTask->retries - 1],
>> + iscsi_retry_times[iTask->retries]);
>> + error_report("iSCSI Busy (retry #%u in %u ms): %s",
>> + iTask->retries, retry_time,
>> + iscsi_get_error(iscsi));
>> + iTask->retry_timer =
>> aio_timer_new(iTask->iscsilun->aio_context,
>> + QEMU_CLOCK_REALTIME,
>> + SCALE_MS,
>> +
>> iscsi_retry_timer_expired,
>> + iTask);
>> + timer_mod(iTask->retry_timer,
>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> retry_time);
>
- [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Peter Lieven, 2014/05/29
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Paolo Bonzini, 2014/05/29
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition,
Peter Lieven <=
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Paolo Bonzini, 2014/05/29
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Peter Lieven, 2014/05/29
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Peter Lieven, 2014/05/29
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Paolo Bonzini, 2014/05/30
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Peter Lieven, 2014/05/30
- Re: [Qemu-devel] [PATCH] block/iscsi: handle BUSY condition, Paolo Bonzini, 2014/05/30