From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v5 11/11] block/backup: use backup-top instead of write notifiers
Date: Mon, 28 Jan 2019 17:53:58 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

On 28.01.19 17:44, Vladimir Sementsov-Ogievskiy wrote:
> 28.01.2019 18:59, Max Reitz wrote:
>> On 28.01.19 12:29, Vladimir Sementsov-Ogievskiy wrote:
>>> 18.01.2019 17:56, Max Reitz wrote:
>>>> On 29.12.18 13:20, Vladimir Sementsov-Ogievskiy wrote:

[...]

>>>>> @@ -505,8 +474,20 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
>>>>>                if (alloced < 0) {
>>>>>                    ret = alloced;
>>>>>                } else {
>>>>> +                if (!hbitmap_get(s->copy_bitmap, offset)) {
>>>>> +                    trace_backup_do_cow_skip(job, offset);
>>>>> +                    continue; /* already copied */
>>>>> +                }
>>>>> +                if (!lock) {
>>>>> +                    lock = bdrv_co_try_lock(s->source, offset, s->cluster_size);
>>>>> +                    /*
>>>>> +                     * Dirty bit is set, which means that there are no in-flight
>>>>> +                     * write requests on this area. We must succeed.
>>>>> +                     */
>>>>> +                    assert(lock);
>>>>
>>>> What if I have a different parent node for the source that issues
>>>> concurrent writes?  This used to work fine because the before_write
>>>> notifier would still work.  After this patch, that would be broken
>>>> because those writes would not cause a CbW.
>>>
>>> But how could you have such a different parent node? After appending the
>>> filter, there should not be such nodes.
>>
>> Unless you append them afterwards:
>>
>>> And I think that during backup it should be forbidden to append new
>>> parents to the source that bypass the filter, as that definitely breaks
>>> what the filter does.
>>
>> Agreed, but then this needs to be implemented.
>>
>>> And the same applies to other block jobs with their filters. If we have
>>> appended a filter, we don't want someone else's writes to bypass it.
>>>
>>>>
>>>> That's not so bad because we just have to make sure that all writes go
>>>> through the backup-top node.  That would make this assertion valid
>>>> again, too.  But that means we cannot share PERM_WRITE; see [1].
>>>
>>> But we don't share PERM_WRITE on source in backup_top, only on target.
>>
>> Are you sure?  The job itself shares it, and the filter shares it, too,
>> as far as I can see.  It uses bdrv_filter_default_perms(), and that does
>> seem to share PERM_WRITE.
> 
> And in bdrv_filter_default_perms() it does "*nshared = *nshared |
> BLK_PERM_WRITE" only for child_file, which is the target. The source is
> child_backing.

Hm?  bdrv_filter_default_perms() does this, unconditionally:

>     *nshared = (shared & DEFAULT_PERM_PASSTHROUGH) |
>                (c->shared_perm & DEFAULT_PERM_UNCHANGED);

The backup_top filter does what you describe, but it just leaves
*nshared untouched after bdrv_filter_default_perms() has done the above.
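
(For context, a sketch from memory that should be checked against this
version's block.c: the two masks used there are defined such that
BLK_PERM_WRITE sits in the pass-through set:

#define DEFAULT_PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
                                 | BLK_PERM_WRITE \
                                 | BLK_PERM_WRITE_UNCHANGED \
                                 | BLK_PERM_RESIZE)

#define DEFAULT_PERM_UNCHANGED (BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH)

So whatever WRITE sharing the parents of backup_top allow is passed
through to its children unless the driver masks it out afterwards.)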

[...]

>>>>> @@ -655,25 +656,31 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>>    
>>>>>        copy_bitmap = hbitmap_alloc(len, ctz32(cluster_size));
>>>>>    
>>>>> -    /* job->len is fixed, so we can't allow resize */
>>>>> -    job = block_job_create(job_id, &backup_job_driver, txn, bs,
>>>>> -                           BLK_PERM_CONSISTENT_READ,
>>>>> -                           BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
>>>>> -                           BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
>>>>> -                           speed, creation_flags, cb, opaque, errp);
>>>>> -    if (!job) {
>>>>> +    /*
>>>>> +     * bdrv_get_device_name will not help to find device name starting from
>>>>> +     * @bs after backup-top append,
>>>>
>>>> Why not?  Since backup-top is appended, shouldn't all parents of @bs be
>>>> parents of @backup_top then?  (Making bdrv_get_parent_name() return the
>>>> same result)
>>>
>>> bdrv_get_device_name() eventually goes through role->get_name, and only
>>> the root role has this handler. After the append, we'll have the backing
>>> role instead of the root one.
>>
>> Ah, I see, I asked the wrong question.
>>
>> Why is block_job_create() called on bs and not on backup_top?  mirror
>> calls it on mirror_top_bs.
> 
> Good question. I don't remember exactly; maybe there were more troubles
> with permissions or something. So I'll have to try it again.
> 
> What is more beneficial?
> 
> My current approach is that the job and the filter are two sibling users
> of the source node; they both do copying, and they are synchronized. In
> this scheme it is better to read from the source directly, so as not to
> create extra interdependence between the job and the filter.
> 
> On the other hand, if we read through the filter, we should probably do
> the whole copy operation through the filter.
> 
> What is the difference between a guest read and a backup-job read, from
> the filter's point of view? I think:
> 
> For a guest read, the filter MUST read (as we must handle the guest
> request); then, if we don't have too many in-flight requests, the RAM
> cache is not full, etc., we can handle the already-read data in terms of
> backup, i.e. copy it somewhere. Or we can drop it, if we can't handle it
> at the moment.
> 
> For a job read, we even MAY not read, if our queues are full, postponing
> the job request.
> 
> So:
> 
> Guest read: MUST read, MAY backup
> Job read: MAY read and backup
> 
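If I read this right, the split would look something like the following
sketch (all names here are invented for illustration; nothing of this is
in the series):

/* Guest read: MUST be served; backing the data up is opportunistic. */
static int coroutine_fn flt_guest_read(FltState *s, uint64_t offset,
                                       uint64_t bytes, QEMUIOVector *qiov)
{
    int ret = bdrv_co_preadv(s->source, offset, bytes, qiov, 0);

    if (ret < 0) {
        return ret;
    }
    if (!flt_queues_full(s)) {
        /* MAY backup: copy the already-read data, or drop it */
        flt_copy_to_target(s, offset, bytes, qiov);
    }
    return 0;
}

/* Job read: MAY be postponed entirely when we are overloaded. */
static int coroutine_fn flt_job_read(FltState *s, uint64_t offset,
                                     uint64_t bytes, QEMUIOVector *qiov)
{
    int ret;

    if (flt_queues_full(s)) {
        return -EAGAIN; /* caller requeues the request */
    }
    ret = bdrv_co_preadv(s->source, offset, bytes, qiov, 0);

    return ret < 0 ? ret : flt_copy_to_target(s, offset, bytes, qiov);
}
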
> So reading through the filter offers the possibility of a common code
> path plus native prioritization of copy operations. That will of course
> need more refactoring of backup and may be done as a separate step, but I
> definitely have to at least try to create the job above the filter.

Well, as far as I see it, right now backup_top's read function is just a
passthrough.  I don't see a functional difference between reading from
backup_top and from the source, apart from the fact that you could save
yourself the trouble of figuring out the job ID manually.
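
(To be concrete, and sketching from memory rather than from the patch: a
pure pass-through read in such a filter is nothing more than forwarding
the request to the source child, roughly

static int coroutine_fn backup_top_co_preadv(BlockDriverState *bs,
                                             uint64_t offset, uint64_t bytes,
                                             QEMUIOVector *qiov, int flags)
{
    /* Forward the read unmodified to the source (backing) node */
    return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
}

so the same request reaches the source either way.)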

As for the RAM cache, I thought it was just a target like any other and
backup_top wouldn't need to care at all...?
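
(For comparison, creating the job on the filter node, as mirror does with
mirror_top_bs, would look roughly like the following, reusing the flags
from the hunk quoted above; whether the permission setup actually works
out with the node swapped is exactly what would need retrying:

    job = block_job_create(job_id, &backup_job_driver, txn, backup_top,
                           BLK_PERM_CONSISTENT_READ,
                           BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
                           BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
                           speed, creation_flags, cb, opaque, errp);

bdrv_get_device_name() would then resolve through a root role on
backup_top again.)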

Max
