[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache |
Date: |
Mon, 24 Jan 2011 15:39:04 +0000 |
On Mon, Jan 24, 2011 at 3:36 PM, Kevin Wolf <address@hidden> wrote:
> Am 24.01.2011 16:26, schrieb Stefan Hajnoczi:
>> On Mon, Jan 24, 2011 at 2:54 PM, Kevin Wolf <address@hidden> wrote:
>>> [ Re-adding qemu-devel to CC ]
>>>
>>> Am 24.01.2011 15:34, schrieb Stefan Hajnoczi:
>>>> On Thu, Jan 20, 2011 at 5:10 PM, Kevin Wolf <address@hidden> wrote:
>>>>> @@ -702,17 +622,30 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState
>>>>> *bs, QCowL2Meta *m)
>>>>>
>>>>> if (m->nb_available & (s->cluster_sectors - 1)) {
>>>>> uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors -
>>>>> 1);
>>>>> + cow = true;
>>>>> ret = copy_sectors(bs, start_sect + end, cluster_offset + (end <<
>>>>> 9),
>>>>> m->nb_available - end, s->cluster_sectors);
>>>>> if (ret < 0)
>>>>> goto err;
>>>>> }
>>>>>
>>>>> - /* update L2 table */
>>>>> + /*
>>>>> + * Update L2 table.
>>>>> + *
>>>>> + * Before we update the L2 table to actually point to the new
>>>>> cluster, we
>>>>> + * need to be sure that the refcounts have been increased and COW was
>>>>> + * handled.
>>>>> + */
>>>>> + if (cow) {
>>>>> + bdrv_flush(bs->file);
>>>>
>>>> Just bdrv_flush(bs->file) and not a refcounts cache flush?
>>>
>>> Refcounts and data need not to be ordered against each other. They only
>>> must both be on disk when we write the L2 table.
>>
>> Have I missed where refcounts actually get flushed from the cache out
>> to the disk because bdrv_flush(bs->file) only syncs the file but
>> doesn't write out dirty data held in cache.
>
> The bdrv_flush isn't supposed to flush the refcounts, but the data
> copied during COW (what you call pre/postfill in QED)
>
> The refcounts are handled by the qcow2_cache_set_dependency below, so
> that before writing the L2 tables we always write the refcounts first.
>
>>>>> + }
>>>>> +
>>>>> + qcow2_cache_set_dependency(bs, s->l2_table_cache,
>>>>> s->refcount_block_cache);
>>>>> ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset,
>>>>> &l2_index);
>>>>> if (ret < 0) {
>>>>> goto err;
>>>>> }
>>>>> + qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>>>>>
>>>>> for (i = 0; i < m->nb_clusters; i++) {
>>>>> /* if two concurrent writes happen to the same unallocated cluster
>>>>> @@ -728,16 +661,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState
>>>>> *bs, QCowL2Meta *m)
>>>>> (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
>>>>> }
>>>>>
>>>>> - /*
>>>>> - * Before we update the L2 table to actually point to the new
>>>>> cluster, we
>>>>> - * need to be sure that the refcounts have been increased and COW was
>>>>> - * handled.
>>>>> - */
>>>>> - bdrv_flush(bs->file);
>>>>>
>>>>> - ret = write_l2_entries(bs, l2_table, l2_offset, l2_index,
>>>>> m->nb_clusters);
>>>>> + ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>>>>> if (ret < 0) {
>>>>> - qcow2_l2_cache_reset(bs);
>>>>> goto err;
>>>>> }
>>>>>
>>>>
>>>> The function continues like this:
>>>>
>>>> /*
>>>> * If this was a COW, we need to decrease the refcount of the old cluster.
>>>> * Also flush bs->file to get the right order for L2 and refcount update.
>>>> */
>>>> if (j != 0) {
>>>> bdrv_flush(bs->file);
>>>> for (i = 0; i < j; i++) {
>>>> qcow2_free_any_clusters(bs,
>>>> be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
>>>> }
>>>> }
>>>>
>>>> Does bdrv_flush(bs->file) "get the right order for L2 and refcount
>>>> update"? We've just put an L2 table, should this be an L2 table
>>>> flush?
>>>
>>> I agree, this looks wrong. What we need is a dependency to ensure that
>>> first L2 is flushed and then the refcount block.
>>> qcow2_free_any_clusters() calls update_refcount() indirectly, which
>>> takes care of setting this dependency.
>>>
>>> So in the end I think it's just an unnecessary bdrv_flush. Makes sense?
>>
>> I don't understand this fully. I've noticed that the cache isn't the
>> only mechanism for making changes to tables - there are functions like
>> write_l2_entries() that directly write out parts of tables without
>> honoring dependencies or using the dirty bit on the cache entry. I
>> probably need to look at this more carefully to fully understand
>> whether or not it is correct.
>
> No, the cache should be the only mechanism that is used for accessing L2
> tables and refcount blocks. write_l2_entries() is an old function that
> is removed by the patch.
>
> Direct accesses should only be left for L1 tables and refcount tables.
Agreed. I was switching between branches and had looked at a
non-qcowcache world!
Stefan