qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 09/26] qcow2: Helper for refcount array reall


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v5 09/26] qcow2: Helper for refcount array reallocation
Date: Wed, 04 Feb 2015 10:57:49 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0

On 2015-02-04 at 08:21, Kevin Wolf wrote:
Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
Add a helper function for reallocating a refcount array, independent of
the refcount order. The newly allocated space is zeroed and the function
handles failed reallocations gracefully.

The helper function will always align the buffer size to a cluster
boundary; if storing the refcounts in such an array in big endian byte
order, this makes it possible to write parts of the array directly as
refcount blocks into the image file.

Signed-off-by: Max Reitz <address@hidden>
Reviewed-by: Eric Blake <address@hidden>
---
  block/qcow2-refcount.c | 137 +++++++++++++++++++++++++++++++------------------
  1 file changed, 88 insertions(+), 49 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index fd28a13..4ede971 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1130,6 +1130,70 @@ fail:
  /* refcount checking functions */
+static size_t refcount_array_byte_size(BDRVQcowState *s, uint64_t entries)
+{
+    if (s->refcount_order < 3) {
+        /* sub-byte width */
+        int shift = 3 - s->refcount_order;
+        return DIV_ROUND_UP(entries, 1 << shift);
+    } else if (s->refcount_order == 3) {
+        /* byte width */
+        return entries;
+    } else {
+        /* multiple bytes wide */
+
+        /* This assertion holds because there is no way we can address more 
than
+         * 2^(64 - 9) clusters at once (with cluster size 512 = 2^9, and 
because
+         * offsets have to be representable in bytes); due to every cluster
Considering this, which implies that a multiplication by 64 can't
overflow, why can't this function be as simple as the following?

     assert(entries <= (1 << (64 - 9)));
     return DIV_ROUND_UP(entries * s->refcount_bits, 8)

Because that was too simple for me to think of.

I'm not sure whether Eric said something about this before, but I think he said he liked separating the cases...

Anyway, I'll take your suggestion, thanks!

Max

+         * corresponding to one refcount entry and because refcount_order has 
to
+         * be below 7, we are far below that limit */
+        assert(!(entries >> (64 - (s->refcount_order - 3))));
+
+        return entries << (s->refcount_order - 3);
+    }
+}
+
+/**
+ * Reallocates *array so that it can hold new_size entries. *size must contain
+ * the current number of entries in *array. If the reallocation fails, *array
+ * and *size will not be modified and -errno will be returned. If the
+ * reallocation is successful, *array will be set to the new buffer and *size
+ * will be set to new_size.
And 0 is returned.

The size of the reallocated refcount array buffer
+ * will be aligned to a cluster boundary, and the newly allocated area will be
+ * zeroed.
+ */
+static int realloc_refcount_array(BDRVQcowState *s, uint16_t **array,
+                                  int64_t *size, int64_t new_size)
+{
+    /* Round to clusters so the array can be directly written to disk */
+    size_t old_byte_size = ROUND_UP(refcount_array_byte_size(s, *size),
+                                    s->cluster_size);
+    size_t new_byte_size = ROUND_UP(refcount_array_byte_size(s, new_size),
+                                    s->cluster_size);
size_to_clusters()? (Unfortunately still not short enough to keep each
initialisation on a single line...)

Right, I'll use it.

+    uint16_t *new_ptr;
+
+    if (new_byte_size == old_byte_size) {
+        *size = new_size;
+        return 0;
+    }
This is only correct if the array was previously allocated by the same
function, or at least rounded up to a cluster boundary. What do we win
by keeping a smaller byte count in *size than is actually allocated? If
we had the real allocation size in it, we wouldn't have to make
assumptions about the real array size.

Historical reasons. *cough*

Other than that, I don't see how assuming the array was previously allocated by this function is a bad thing. I'll gladly continue under that assumption. And by having the number of entries in *size instead of the byte size of the refcount array, this patch is kept shorter (I have no proof of that, though) because the size of that array was (before this patch and before this series) not kept in bytes, but in entries. This is why keeping that unit here makes sense; also I don't like having different units across function parameters (in this case, bytes and entries).

So the first answer to your question "What do we win by keeping a smaller byte count in *size than is actually allocated?" is "it isn't a byte count but an entry count". The answer to the (naturally?) ensuing question "So why do we keep a smaller entry count in *size than is actually allocated?" is "because there is no refcount_array_entry_count_to_byte_size() and I don't see what's so bad about this block that we have to introduce that function".

Thus, to me it looks trying to have the maximum number of entries stored in *size (or the byte size of the table) will result in more code, and that is what we win by not doing that.

As far as I can see from your response, what we're losing is that we have to assume that the refcount array is always allocated by this function; as I've said above, I'll gladly do so.

Max

+    assert(new_byte_size > 0);
+
+    new_ptr = g_try_realloc(*array, new_byte_size);
+    if (!new_ptr) {
+        return -ENOMEM;
+    }
+
+    if (new_byte_size > old_byte_size) {
+        memset((void *)((uintptr_t)new_ptr + old_byte_size), 0,
+               new_byte_size - old_byte_size);
+    }
+
+    *array = new_ptr;
+    *size  = new_size;
+
+    return 0;
+}
/*
   * Increases the refcount for a range of clusters in a given refcount table.
@@ -1146,6 +1210,7 @@ static int inc_refcounts(BlockDriverState *bs,
  {
      BDRVQcowState *s = bs->opaque;
      uint64_t start, last, cluster_offset, k;
+    int ret;
if (size <= 0) {
          return 0;
@@ -1157,23 +1222,12 @@ static int inc_refcounts(BlockDriverState *bs,
          cluster_offset += s->cluster_size) {
          k = cluster_offset >> s->cluster_bits;
          if (k >= *refcount_table_size) {
-            int64_t old_refcount_table_size = *refcount_table_size;
-            uint16_t *new_refcount_table;
-
-            *refcount_table_size = k + 1;
-            new_refcount_table = g_try_realloc(*refcount_table,
-                                               *refcount_table_size *
-                                               sizeof(**refcount_table));
-            if (!new_refcount_table) {
-                *refcount_table_size = old_refcount_table_size;
+            ret = realloc_refcount_array(s, refcount_table,
+                                         refcount_table_size, k + 1);
+            if (ret < 0) {
                  res->check_errors++;
-                return -ENOMEM;
+                return ret;
              }
-            *refcount_table = new_refcount_table;
-
-            memset(*refcount_table + old_refcount_table_size, 0,
-                   (*refcount_table_size - old_refcount_table_size) *
-                   sizeof(**refcount_table));
          }
if (++(*refcount_table)[k] == 0) {
@@ -1542,8 +1596,7 @@ static int check_refblocks(BlockDriverState *bs, 
BdrvCheckResult *res,
                      fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR", i);
if (fix & BDRV_FIX_ERRORS) {
-                int64_t old_nb_clusters = *nb_clusters;
-                uint16_t *new_refcount_table;
+                int64_t new_nb_clusters;
if (offset > INT64_MAX - s->cluster_size) {
                      ret = -EINVAL;
@@ -1560,22 +1613,15 @@ static int check_refblocks(BlockDriverState *bs, 
BdrvCheckResult *res,
                      goto resize_fail;
                  }
- *nb_clusters = size_to_clusters(s, size);
-                assert(*nb_clusters >= old_nb_clusters);
+                new_nb_clusters = size_to_clusters(s, size);
+                assert(new_nb_clusters >= *nb_clusters);
- new_refcount_table = g_try_realloc(*refcount_table,
-                                                   *nb_clusters *
-                                                   sizeof(**refcount_table));
-                if (!new_refcount_table) {
-                    *nb_clusters = old_nb_clusters;
+                ret = realloc_refcount_array(s, refcount_table,
+                                             nb_clusters, new_nb_clusters);
+                if (ret < 0) {
                      res->check_errors++;
-                    return -ENOMEM;
+                    return ret;
                  }
-                *refcount_table = new_refcount_table;
-
-                memset(*refcount_table + old_nb_clusters, 0,
-                       (*nb_clusters - old_nb_clusters) *
-                       sizeof(**refcount_table));
if (cluster >= *nb_clusters) {
                      ret = -EINVAL;
@@ -1635,10 +1681,12 @@ static int calculate_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res,
      int ret;
if (!*refcount_table) {
-        *refcount_table = g_try_new0(uint16_t, *nb_clusters);
-        if (*nb_clusters && *refcount_table == NULL) {
+        int64_t old_size = 0;
+        ret = realloc_refcount_array(s, refcount_table,
+                                     &old_size, *nb_clusters);
Here the returned new size is thrown away.

With the implementation of realloc_refcount_array() as above this is not
incorrect because it is *nb_clusters anyway when the function succeeds,
but it's a bit sloppy.

If you decide to allow realloc_refcount_array() returning a size bigger
than was requested (i.e. the rounded value is returned) as I suggested
above, then you need to change this call.

+        if (ret < 0) {
              res->check_errors++;
-            return -ENOMEM;
+            return ret;
          }
      }
Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]