qemu-devel
Re: [Qemu-devel] [PATCH 3/8] block: add image streaming block job


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 3/8] block: add image streaming block job
Date: Wed, 2 Nov 2011 15:43:49 +0000

On Tue, Nov 1, 2011 at 6:06 PM, Marcelo Tosatti <address@hidden> wrote:
> On Thu, Oct 27, 2011 at 04:22:50PM +0100, Stefan Hajnoczi wrote:
>> +static int stream_one_iteration(StreamBlockJob *s, int64_t sector_num,
>> +                                void *buf, int max_sectors, int *n)
>> +{
>> +    BlockDriverState *bs = s->common.bs;
>> +    int ret;
>> +
>> +    trace_stream_one_iteration(s, sector_num, max_sectors);
>> +
>> +    ret = bdrv_is_allocated(bs, sector_num, max_sectors, n);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>
> bdrv_is_allocated is still synchronous? If so, there should be at least
> a plan to make it asynchronous.

Yes, that's a good discussion to have.  My thoughts are that
bdrv_is_allocated() should be executed in coroutine context.  The
semantics are a little tricky because of parallel requests:

1. If a write request is in progress when we do bdrv_is_allocated() we
might get back "unallocated" even though clusters are just being
allocated.
2. If a TRIM request is in progress when we do bdrv_is_allocated() we
might get back "allocated" even though clusters are just being
deallocated.

In order to be reliable the caller needs to be aware of parallel
requests.  I think it's correct to defer this problem to the caller.

In the case of image streaming we're not TRIM-safe; I haven't really
thought about that yet.  But we are safe against parallel write
requests because there is serialization to prevent copy-on-read
requests from racing with write requests.
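
To make the caller's responsibility concrete, here is a rough sketch of
what a coroutine-context caller could look like.  This is not part of
the patch, and wait_for_overlapping_requests() is a made-up placeholder
for the serialization the streaming code already does for copy-on-read:

/* Sketch only: bdrv_is_allocated() results are advisory while parallel
 * requests are in flight.
 */
static int coroutine_fn stream_check_allocated(BlockDriverState *bs,
                                               int64_t sector_num,
                                               int nb_sectors, int *pnum)
{
    /* Case 1: a parallel write may still be allocating these clusters,
     * so "unallocated" can be stale.  Serializing against overlapping
     * writes closes this window.
     */
    wait_for_overlapping_requests(bs, sector_num, nb_sectors); /* placeholder */

    /* Case 2: a parallel TRIM may be deallocating these clusters, so
     * "allocated" can also be stale.  Streaming is not TRIM-safe yet,
     * so this case stays with the caller.
     */
    return bdrv_is_allocated(bs, sector_num, nb_sectors, pnum);
}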

>> +    if (!ret) {
>> +        ret = stream_populate(bs, sector_num, *n, buf);
>> +    }
>> +    return ret;
>> +}
>> +
>> +static void coroutine_fn stream_run(void *opaque)
>> +{
>> +    StreamBlockJob *s = opaque;
>> +    BlockDriverState *bs = s->common.bs;
>> +    int64_t sector_num, end;
>> +    int ret = 0;
>> +    int n;
>> +    void *buf;
>> +
>> +    buf = qemu_blockalign(bs, STREAM_BUFFER_SIZE);
>> +    s->common.len = bdrv_getlength(bs);
>> +    bdrv_get_geometry(bs, (uint64_t *)&end);
>> +
>> +    bdrv_set_zero_detection(bs, true);
>> +    bdrv_set_copy_on_read(bs, true);
>
> Should distinguish between stream-initiated and user-initiated
> setting of zero detection and copy-on-read (so that unsetting below
> does not clear user settings).

For zero detection I agree.

For copy-on-read it is questionable, since once streaming is complete
it no longer makes sense to have copy-on-read enabled.

I will address this in the next revision and think more about the
copy-on-read case.
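
One way to handle it is to save the user's setting before the job
starts and restore it when the job ends.  A rough sketch, not from the
series (bdrv_get_zero_detection() is a hypothetical accessor; the
series only adds the setter):

/* Remember the user-initiated setting before the job overrides it */
bool saved_zero_detection = bdrv_get_zero_detection(bs); /* hypothetical */

bdrv_set_zero_detection(bs, true);
bdrv_set_copy_on_read(bs, true);

/* ... streaming loop runs here ... */

/* Restore the user's zero detection setting.  Copy-on-read can simply
 * be disabled because it no longer makes sense once streaming is
 * complete.
 */
bdrv_set_zero_detection(bs, saved_zero_detection);
bdrv_set_copy_on_read(bs, false);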

>> +
>> +    for (sector_num = 0; sector_num < end; sector_num += n) {
>> +        if (block_job_is_cancelled(&s->common)) {
>> +            break;
>> +        }
>
> If cancellation is seen here in the last loop iteration,
> bdrv_change_backing_file below should not be executed.

I documented this case in the QMP API.  I'm not sure it's possible to
guarantee that the operation isn't just completing as you cancel it.
Any blocking point between the last iteration finishing and the job
completing is vulnerable to missing the cancel.  It's easier to
explicitly say the operation might just have completed when you
canceled, rather than trying to protect the completion path.  Do you
think it's a problem to have these loose semantics?
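
To illustrate the window, here is a sketch (the completion step is
schematic, not the exact patch code):

for (sector_num = 0; sector_num < end; sector_num += n) {
    if (block_job_is_cancelled(&s->common)) {
        break;                    /* cancel observed: skip completion */
    }
    /* ... copy one chunk ... */
}

/* A cancel issued here, after the final iteration's check, is missed
 * and the job still completes.  Rather than trying to close this
 * window, the QMP documentation says a cancel may race with
 * completion.
 */
if (sector_num == end && ret == 0) {
    bdrv_change_backing_file(bs, NULL, NULL);  /* completion step */
}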

>> +
>> +        /* TODO rate-limit */
>> +        /* Note that even when no rate limit is applied we need to yield with
>> +         * no pending I/O here so that qemu_aio_flush() is able to return.
>> +         */
>> +        co_sleep_ns(rt_clock, 0);
>
> How do you plan to implement rate limit?

It was implemented in the QED-specific image streaming series:

http://repo.or.cz/w/qemu/stefanha.git/commitdiff/22f2c09d2fcfe5e49ac4604fd23e4744f549a476

That implementation works fine and is small, but I'd like to reuse the
migration speed limit, if possible.  That way we don't have three
different rate-limiting implementations in QEMU :).
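
The idea there is simple slice-based accounting.  Roughly (a sketch of
the approach, not the exact code from that commit):

typedef struct {
    int64_t next_slice_time;  /* end of the current 100 ms slice (ns) */
    uint64_t slice_quota;     /* sectors allowed per slice */
    uint64_t dispatched;      /* sectors copied in the current slice */
} RateLimit;

/* Returns how long the caller should co_sleep_ns() before copying
 * another n sectors, or 0 if n is still within this slice's budget.
 */
static int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
{
    int64_t now = qemu_get_clock_ns(rt_clock);

    if (limit->next_slice_time < now) {
        /* Start a new 100 ms slice */
        limit->next_slice_time = now + 100 * 1000000;
        limit->dispatched = 0;
    }
    if (limit->dispatched + n > limit->slice_quota) {
        return limit->next_slice_time - now;  /* over budget: wait */
    }
    limit->dispatched += n;
    return 0;
}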

Stefan


