[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to blo
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver |
Date: |
Thu, 20 Jun 2013 14:11:33 +0200 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Wed, Jun 19, 2013 at 12:50:16PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > + DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
> > + __func__, bdrv_get_device_name(bs), start, sector_num,
> > nb_sectors);
>
> Maybe put the first "%s" and __func__ directly into the DPRINTF macro?
Will use trace events instead.
> > + wait_for_overlapping_requests(job, start, end);
> > + cow_request_begin(&cow_request, job, start, end);
> > +
> > + total_sectors = bdrv_getlength(bs);
> > + if (total_sectors < 0) {
> > + if (error_is_read) {
> > + *error_is_read = true;
> > + }
> > + ret = total_sectors;
> > + goto out;
> > + }
> > + total_sectors /= BDRV_SECTOR_SIZE;
>
> Why is this needed? There are two callers of the function: One is the
> write notifier, which has already a sector number that is checked
> against the image size. The other one is background job that gets the
> size once when it starts. I don't think there's a reason to call the
> block driver each time and potentially do an expensive request.
>
> At first I thought it has something to do with resizing images, but it's
> forbidden while a block job is running (otherwise the job's main loop
> exit condition would be wrong, too), so that doesn't make it necessary.
Thanks, I will eliminate the call.
> > + /* Publish progress */
> > + job->sectors_read += n;
> > + job->common.offset += n * BDRV_SECTOR_SIZE;
>
> This is interesting, because the function is not only called by the
> background job, but also by write notifiers. So 'offset' in a literal
> sense doesn't make too much sense because we're not operating purely
> sequential.
>
> The QAPI documentation describes 'offset' like this:
>
> # @offset: the current progress value
>
> If we take it as just that, I think we could actually consider this code
> correct, because it's a useful measure for the progress (each sector is
> copied only once, either by the job or by a notifier), even though it
> really has nothing to do with an offset into the image.
>
> Maybe a comment would be appropriate.
Will add the comment.
> > +static BlockJobType backup_job_type = {
> > + .instance_size = sizeof(BackupBlockJob),
> > + .job_type = "backup",
> > + .set_speed = backup_set_speed,
> > + .iostatus_reset = backup_iostatus_reset,
> > +};
>
> Align = on the same column? Should probably be const, too.
Thanks for pointing the const out. I'll throw in alignment as a bonus,
there is a team of ASCII artists here who do amazing whitespace work ;).
> > +static void coroutine_fn backup_run(void *opaque)
> > +{
> > + BackupBlockJob *job = opaque;
> > + BlockDriverState *bs = job->common.bs;
> > + BlockDriverState *target = job->target;
> > + BlockdevOnError on_target_error = job->on_target_error;
> > + NotifierWithReturn before_write = {
> > + .notify = backup_before_write_notify,
> > + };
> > + int64_t start, end;
> > + int ret = 0;
> > +
> > + QLIST_INIT(&job->inflight_reqs);
> > + qemu_co_rwlock_init(&job->flush_rwlock);
> > +
> > + start = 0;
> > + end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,
>
> bdrv_getlength() can fail.
Will fix.
> > + BACKUP_SECTORS_PER_CLUSTER);
> > +
> > + job->bitmap = hbitmap_alloc(end, 0);
> > +
> > + bdrv_set_on_error(target, on_target_error, on_target_error);
> > + bdrv_iostatus_enable(target);
>
> Mirroring also has:
>
> bdrv_set_enable_write_cache(s->target, true);
>
> Wouldn't it make sense for backup as well?
I guess so.
> > + /* we need to yield so that qemu_aio_flush() returns.
> > + * (without, VM does not reboot)
> > + */
> > + if (job->common.speed) {
> > + uint64_t delay_ns = ratelimit_calculate_delay(
> > + &job->limit, job->sectors_read);
> > + job->sectors_read = 0;
> > + block_job_sleep_ns(&job->common, rt_clock, delay_ns);
>
> Here's the other implication of counting the progress of copies
> initiated by write notifiers: The more copies they trigger, the less
> additional copies are made by the background job.
>
> I'm willing to count this as a feature rather than a bug.
Yes, the guest does not get throttled by the block job.
> > + } else {
> > + block_job_sleep_ns(&job->common, rt_clock, 0);
> > + }
> > +
> > + if (block_job_is_cancelled(&job->common)) {
> > + break;
> > + }
> > +
> > + DPRINTF("backup_run loop C%" PRId64 "\n", start);
> > +
> > + ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
> > + &error_is_read);
>
> You're taking advantage of the fact that backup_do_cow() rounds this one
> sector up to 64k, which is a much more reasonable size. But why not
> specify BACKUP_SECTORS_PER_CLUSTER as nb_sectors when this is what you
> really assume?
Sounds good though I need to double-check if we run into issues when
hitting the end of the disk (if not aligned to
BACKUP_SECTORS_PER_CLUSTER).
- Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver, (continued)