qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Re: [PATCH v5 4/5] qed: Read/write support


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Re: [PATCH v5 4/5] qed: Read/write support
Date: Fri, 3 Dec 2010 20:29:47 +0000

On Fri, Dec 3, 2010 at 4:06 PM, Kevin Wolf <address@hidden> wrote:
> Am 24.11.2010 12:11, schrieb Stefan Hajnoczi:
>> +static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
>> +{
>> +    QEDAIOCB *acb = (QEDAIOCB *)blockacb;
>> +    bool finished = false;
>> +
>> +    /* Wait for the request to finish */
>> +    acb->finished = &finished;
>> +    while (!finished) {
>> +        qemu_aio_wait();
>> +    }
>> +}
>
> Hm, you don't even try to cancel? I wonder how useful the individual
> bdrv_aio_cancel implementations actually are when nobody implements
> cancellation. It seems to be pretty hard to do it right.
>
> Maybe we should consider implementing a default bdrv_aio_cancel
> implementation in block.c that waits for completion?

Cancellation is difficult to implement correctly.  Even
qemu_aio_wait() is bad because IDE/scsi will block until the request
completes.  We really should have something better.

Also from what I can tell only USB gadget fs implements it in Linux
native aio!  In posix-aio-compat.c we can only cancel requests if they
haven't been issued yet.

So in summary, cancel is a nasty interface :).

>> +/**
>> + * Construct an iovec array for a given length
>> + *
>> + * @acb:        I/O request
>> + * @len:        Maximum number of bytes
>> + *
>> + * This function can be called several times to build subset iovec arrays of
>> + * acb->qiov.  For example:
>> + *
>> + *   acb->qiov->iov[] = {{0x100000, 1024},
>> + *                       {0x200000, 1024}}
>> + *
>> + *   qed_acb_build_qiov(acb, 512) =>
>> + *                      {{0x100000, 512}}
>> + *
>> + *   qed_acb_build_qiov(acb, 1024) =>
>> + *                      {{0x100200, 512},
>> + *                       {0x200000, 512}}
>> + *
>> + *   qed_acb_build_qiov(acb, 512) =>
>> + *                      {{0x200200, 512}}
>> + */
>> +static void qed_acb_build_qiov(QEDAIOCB *acb, size_t len)
>> +{
>> +    struct iovec *iov_end = &acb->qiov->iov[acb->qiov->niov];
>> +    size_t iov_offset = acb->cur_iov_offset;
>> +    struct iovec *iov = acb->cur_iov;
>> +
>> +    while (iov != iov_end && len > 0) {
>> +        size_t nbytes = MIN(iov->iov_len - iov_offset, len);
>> +
>> +        qemu_iovec_add(&acb->cur_qiov, iov->iov_base + iov_offset, nbytes);
>> +        iov_offset += nbytes;
>> +        len -= nbytes;
>> +
>> +        if (iov_offset >= iov->iov_len) {
>> +            iov_offset = 0;
>> +            iov++;
>> +        }
>> +    }
>> +
>> +    /* Stash state for next time */
>> +    acb->cur_iov = iov;
>> +    acb->cur_iov_offset = iov_offset;
>> +}
>
> Wouldn't it be much simpler to just store the offset into acb->qiov and
> to use qemu_iovec_copy to get the subset qiov?

Good point, fixed in v6.

>> +/**
>> + * Write data to the image file
>> + */
>> +static void qed_aio_write_main(void *opaque, int ret)
>> +{
>> +    QEDAIOCB *acb = opaque;
>> +    BDRVQEDState *s = acb_to_s(acb);
>> +    uint64_t offset = acb->cur_cluster;
>> +    BlockDriverCompletionFunc *next_fn;
>> +    BlockDriverAIOCB *file_acb;
>> +
>> +    trace_qed_aio_write_main(s, acb, ret, offset, acb->cur_qiov.size);
>
> Why does the trace use a different offset...
>
>> +
>> +    if (ret) {
>> +        qed_aio_complete(acb, ret);
>> +        return;
>> +    }
>> +
>> +    if (acb->find_cluster_ret == QED_CLUSTER_FOUND) {
>> +        next_fn = qed_aio_next_io;
>> +    } else {
>> +        if (s->bs->backing_hd) {
>> +            next_fn = qed_aio_write_flush_before_l2_update;
>> +        } else {
>> +            next_fn = qed_aio_write_l2_update;
>> +        }
>> +    }
>> +
>> +    offset += qed_offset_into_cluster(s, acb->cur_pos);
>
> ...than the write uses?
>
> I missed this line at first because offset is initialized above, so I
> didn't expect that this was only half of its initialization.

Fixed in v6.

>> +/**
>> + * Populate back untouched region of new data cluster
>> + */
>> +static void qed_aio_write_postfill(void *opaque, int ret)
>> +{
>> +    QEDAIOCB *acb = opaque;
>> +    BDRVQEDState *s = acb_to_s(acb);
>> +    uint64_t start = acb->cur_pos + acb->cur_qiov.size;
>> +    uint64_t len = qed_start_of_cluster(s, start + s->header.cluster_size - 
>> 1) - start;
>> +    uint64_t offset = acb->cur_cluster + qed_offset_into_cluster(s, 
>> acb->cur_pos) + acb->cur_qiov.size;
>
> This look like more than 80 characters.

Fixed in v6.

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]