Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding

From:	Vladimir Sementsov-Ogievskiy
Subject:	Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding
Date:	Fri, 26 Mar 2021 17:38:28 +0300
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

26.03.2021 16:47, Max Reitz wrote:

On 26.03.21 12:48, Vladimir Sementsov-Ogievskiy wrote:

10.03.2021 18:59, Max Reitz wrote:

When rebuilding the refcount structures (when qemu-img check -r found
errors with refcount = 0, but reference count > 0), the new refcount
table defaults to being put at the image file end[1].  There is no good
reason for that except that it means we will not have to rewrite any
refblocks we already wrote to disk.

Changing the code to rewrite those refblocks is not too difficult,
though, so let us do that.  That is beneficial for images on block
devices, where we cannot really write beyond the end of the image file.

[1] Unless there is something allocated in the area pointed to by the
     last refblock, so we have to write that refblock.  In that case, we
     try to put the reftable in there.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1519071
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
  block/qcow2-refcount.c | 126 ++++++++++++++++++++++-------------------
  1 file changed, 67 insertions(+), 59 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 8e649b008e..162caeeb8e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -2352,8 +2352,9 @@ static int rebuild_refcount_structure(BlockDriverState 
*bs,
                                        int64_t *nb_clusters)
  {
      BDRVQcow2State *s = bs->opaque;
-    int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0;
+    int64_t first_free_cluster = 0, reftable_offset = -1, cluster;
      int64_t refblock_offset, refblock_start, refblock_index;
+    int64_t first_cluster, end_cluster;
      uint32_t reftable_size = 0;
      uint64_t *on_disk_reftable = NULL;
      void *on_disk_refblock;
@@ -2365,8 +2366,11 @@ static int rebuild_refcount_structure(BlockDriverState 
*bs,
      qcow2_cache_empty(bs, s->refcount_block_cache);
+    first_cluster = 0;
+    end_cluster = *nb_clusters;
+
  write_refblocks:
-    for (; cluster < *nb_clusters; cluster++) {
+    for (cluster = first_cluster; cluster < end_cluster; cluster++) {
          if (!s->get_refcount(*refcount_table, cluster)) {
              continue;
          }
@@ -2374,65 +2378,68 @@ write_refblocks:
          refblock_index = cluster >> s->refcount_block_bits;
          refblock_start = refblock_index << s->refcount_block_bits;
-        /* Don't allocate a cluster in a refblock already written to disk */
-        if (first_free_cluster < refblock_start) {
-            first_free_cluster = refblock_start;
-        }
-        refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
-                                              nb_clusters, 
&first_free_cluster);
-        if (refblock_offset < 0) {
-            fprintf(stderr, "ERROR allocating refblock: %s\n",
-                    strerror(-refblock_offset));
-            res->check_errors++;
-            ret = refblock_offset;
-            goto fail;
-        }
-
-        if (reftable_size <= refblock_index) {
-            uint32_t old_reftable_size = reftable_size;
-            uint64_t *new_on_disk_reftable;
+        if (reftable_size > refblock_index &&
+            on_disk_reftable[refblock_index])
+        {
+            refblock_offset = on_disk_reftable[refblock_index];


In this branch, we assign it to ..

+        } else {
+            int64_t refblock_cluster_index;
-            reftable_size = ROUND_UP((refblock_index + 1) * 
REFTABLE_ENTRY_SIZE,
-                                     s->cluster_size) / REFTABLE_ENTRY_SIZE;
-            new_on_disk_reftable = g_try_realloc(on_disk_reftable,
-                                                 reftable_size *
-                                                 REFTABLE_ENTRY_SIZE);
-            if (!new_on_disk_reftable) {
+            /* Don't allocate a cluster in a refblock already written to disk 
*/
+            if (first_free_cluster < refblock_start) {
+                first_free_cluster = refblock_start;
+            }
+            refblock_offset = alloc_clusters_imrt(bs, 1, refcount_table,
+                                                  nb_clusters,
+                                                  &first_free_cluster);
+            if (refblock_offset < 0) {
+                fprintf(stderr, "ERROR allocating refblock: %s\n",
+                        strerror(-refblock_offset));
                  res->check_errors++;
-                ret = -ENOMEM;
+                ret = refblock_offset;
                  goto fail;
              }
-            on_disk_reftable = new_on_disk_reftable;
-            memset(on_disk_reftable + old_reftable_size, 0,
-                   (reftable_size - old_reftable_size) * REFTABLE_ENTRY_SIZE);
+            refblock_cluster_index = refblock_offset / s->cluster_size;
+            if (refblock_cluster_index >= end_cluster) {
+                /*
+                 * We must write the refblock that holds this refblock's
+                 * refcount
+                 */
+                end_cluster = refblock_cluster_index + 1;
+            }
-            /* The offset we have for the reftable is now no longer valid;
-             * this will leak that range, but we can easily fix that by running
-             * a leak-fixing check after this rebuild operation */
-            reftable_offset = -1;
-        } else {
-            assert(on_disk_reftable);
-        }
-        on_disk_reftable[refblock_index] = refblock_offset;
+            if (reftable_size <= refblock_index) {
+                uint32_t old_reftable_size = reftable_size;
+                uint64_t *new_on_disk_reftable;
+
+                reftable_size =
+                    ROUND_UP((refblock_index + 1) * REFTABLE_ENTRY_SIZE,
+                             s->cluster_size) / REFTABLE_ENTRY_SIZE;
+                new_on_disk_reftable =
+                    g_try_realloc(on_disk_reftable,
+                                  reftable_size * REFTABLE_ENTRY_SIZE);
+                if (!new_on_disk_reftable) {
+                    res->check_errors++;
+                    ret = -ENOMEM;
+                    goto fail;
+                }
+                on_disk_reftable = new_on_disk_reftable;
-        /* If this is apparently the last refblock (for now), try to squeeze 
the
-         * reftable in */
-        if (refblock_index == (*nb_clusters - 1) >> s->refcount_block_bits &&
-            reftable_offset < 0)
-        {
-            uint64_t reftable_clusters = size_to_clusters(s, reftable_size *
- REFTABLE_ENTRY_SIZE);
-            reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
-                                                  refcount_table, nb_clusters,
-                                                  &first_free_cluster);
-            if (reftable_offset < 0) {
-                fprintf(stderr, "ERROR allocating reftable: %s\n",
-                        strerror(-reftable_offset));
-                res->check_errors++;
-                ret = reftable_offset;
-                goto fail;
+                memset(on_disk_reftable + old_reftable_size, 0,
+                       (reftable_size - old_reftable_size) *
+                       REFTABLE_ENTRY_SIZE);
+
+                /*
+                 * The offset we have for the reftable is now no longer valid;
+                 * this will leak that range, but we can easily fix that by
+                 * running a leak-fixing check after this rebuild operation
+                 */
+                reftable_offset = -1;
+            } else {
+                assert(on_disk_reftable);
              }
+            on_disk_reftable[refblock_index] = refblock_offset;


only to write back again ?


This assignment is on a deeper level, though, isn it?  I.e. it’s inside the 
else branch from above.


Ahm right.

I looked at it in vimdiff, and modifed pre-patch alignment so that it looks 
good.. but something went wrong.

          }
          ret = qcow2_pre_write_overlap_check(bs, 0, refblock_offset,
@@ -2459,15 +2466,12 @@ write_refblocks:
      }
      if (reftable_offset < 0) {


at this point reftable_offset is always -1 now..

Ah not. and now I am a bit close to understanding all of this logic. this thing with 
"goto write_refblocks" is not obvious

-        uint64_t post_refblock_start, reftable_clusters;
+        uint64_t reftable_clusters;
-        post_refblock_start = ROUND_UP(*nb_clusters, s->refcount_block_size);
          reftable_clusters =
              size_to_clusters(s, reftable_size * REFTABLE_ENTRY_SIZE);
-        /* Not pretty but simple */
-        if (first_free_cluster < post_refblock_start) {
-            first_free_cluster = post_refblock_start;
-        }
+
+        first_free_cluster = 0;
          reftable_offset = alloc_clusters_imrt(bs, reftable_clusters,
                                                refcount_table, nb_clusters,
                                                &first_free_cluster);
@@ -2479,6 +2483,10 @@ write_refblocks:
              goto fail;
          }
+        assert(offset_into_cluster(s, reftable_offset) == 0);
+        first_cluster = reftable_offset / s->cluster_size;
+        end_cluster = first_cluster + reftable_clusters;
+
          goto write_refblocks;


these three lines now looks like a function call in assembler :)


You didn't ping the series (more than two week old) so, I'm not sure that you are not 
preparing v2 now.. But I kept it "marked unred" all this time, and several 
times tried to look at it, and postponed, because I don't familiar with this place of 
qcow2 driver. And the function looks too difficult. Now finally I think I understand most 
of the logic that you change. Honestly, I think a bit of refactoring the 
rebuild_refcount_structure() prior to logic change would make it a lot easier to review..


Hm, yes, putting the for () loop into its own function would probably be a good 
starting put.  I’ll look into it.


Thanks!


--
Best regards,
Vladimir

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH 0/4] qcow2: Improve refcount structure rebuilding, Max Reitz, 2021/03/10
- [PATCH 1/4] qcow2: Improve refcount structure rebuilding, Max Reitz, 2021/03/10
  - Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding, Vladimir Sementsov-Ogievskiy, 2021/03/26
    - Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding, Max Reitz, 2021/03/26
    - Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding, Vladimir Sementsov-Ogievskiy <=
- [PATCH 2/4] iotests/common.qemu: Add _cleanup_single_qemu, Max Reitz, 2021/03/10
- [PATCH 4/4] iotests/108: Test new refcount rebuild algorithm, Max Reitz, 2021/03/10
  - Re: [PATCH 4/4] iotests/108: Test new refcount rebuild algorithm, Eric Blake, 2021/03/10
- [PATCH 3/4] iotests/common.qemu: Allow using the QSD, Max Reitz, 2021/03/10

Prev by Date: Re: [PATCH v2 1/3] hw/display/bcm2835_fb: Resize console on reset
Next by Date: Re: [PATCH 2/2] Support monitor chardev hotswap with QMP
Previous by thread: Re: [PATCH 1/4] qcow2: Improve refcount structure rebuilding
Next by thread: [PATCH 2/4] iotests/common.qemu: Add _cleanup_single_qemu
Index(es):
- Date
- Thread