qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert


From: Anton Nefedov
Subject: Re: [Qemu-devel] [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete
Date: Tue, 25 Apr 2017 16:27:46 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0



On 04/25/2017 12:55 PM, Peter Lieven wrote:
Am 24.04.2017 um 22:13 schrieb Anton Nefedov:

On 24/04/2017 21:16, Peter Lieven wrote:


Am 24.04.2017 um 18:27 schrieb Anton Nefedov
<address@hidden>:

On 04/21/2017 03:37 PM, Peter Lieven wrote:
Am 21.04.2017 um 14:19 schrieb Anton Nefedov:
On 04/21/2017 01:44 PM, Peter Lieven wrote:
Am 21.04.2017 um 12:04 schrieb Anton Nefedov:
On error path (like i/o error in one of the coroutines), it's
required to
 - wait for coroutines completion before cleaning the common
structures
 - reenter dependent coroutines so they ever finish

Introduced in 2d9187bc65.

Signed-off-by: Anton Nefedov <address@hidden>
---
[..]



And what if we error out in the read path? Wouldn't be something
like this easier?


diff --git a/qemu-img.c b/qemu-img.c
index 22f559a..4ff1085 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1903,6 +1903,16 @@ static int convert_do_copy(ImgConvertState
*s)
        main_loop_wait(false);
    }

+    /* on error path we need to enter all coroutines that are still
+     * running before cleaning up common structures */
+    if (s->ret) {
+        for (i = 0; i < s->num_coroutines; i++) {
+             if (s->co[i]) {
+                 qemu_coroutine_enter(s->co[i]);
+             }
+        }
+    }
+
    if (s->compressed && !s->ret) {
        /* signal EOF to align */
        ret = blk_pwrite_compressed(s->target, 0, NULL, 0);


Peter


seemed a bit too daring to me to re-enter every coroutine
potentially including the ones that yielded waiting for I/O
completion.
If that's ok - that is for sure easier :)

I think we should enter every coroutine that is still running and
have it terminate correctly. It was my mistake that I have not
done this in the original patch. Can you check if the above fixes
your test cases that triggered the bug?


hi, sorry I'm late with the answer

this segfaults in bdrv_close(). Looks like it tries to finish some
i/o which coroutine we have already entered and terminated?

(gdb) run
Starting program: /vz/anefedov/qemu-build/us/./qemu-img convert -O
[..]


Peter


/Anton


it seems that this is a bit tricky, can you share how your test case
works?

thanks,
peter


how I tested today was basically

diff --git a/qemu-img.c b/qemu-img.c
index 4425aaa..3d2d506 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1788,6 +1788,10 @@ static void coroutine_fn
convert_co_do_copy(void *opaque)

         if (status == BLK_DATA) {
             ret = convert_co_read(s, sector_num, n, buf);
+            const uint64_t fsector = 10*1024*1024/512;
+            if (sector_num <= fsector && fsector < sector_num+n) {
+                ret = -EIO;
+            }
             if (ret < 0) {
                 error_report("error while reading sector %" PRId64
                              ": %s", sector_num, strerror(-ret));


I ended up with sth similar to your approach. Can you check this?


Thanks,

Peter


diff --git a/qemu-img.c b/qemu-img.c
index b94fc11..ed9ce9e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1761,13 +1761,13 @@ static void coroutine_fn convert_co_do_copy(void
*opaque)
         qemu_co_mutex_lock(&s->lock);
         if (s->ret != -EINPROGRESS || s->sector_num >= s->total_sectors) {
             qemu_co_mutex_unlock(&s->lock);
-            goto out;
+            break;
         }
         n = convert_iteration_sectors(s, s->sector_num);
         if (n < 0) {
             qemu_co_mutex_unlock(&s->lock);
             s->ret = n;
-            goto out;
+            break;
         }
         /* save current sector and allocation status to local variables */
         sector_num = s->sector_num;
@@ -1792,7 +1792,6 @@ static void coroutine_fn convert_co_do_copy(void
*opaque)
                 error_report("error while reading sector %" PRId64
                              ": %s", sector_num, strerror(-ret));
                 s->ret = ret;
-                goto out;
             }
         } else if (!s->min_sparse && status == BLK_ZERO) {
             status = BLK_DATA;
@@ -1801,22 +1800,20 @@ static void coroutine_fn convert_co_do_copy(void
*opaque)

         if (s->wr_in_order) {
             /* keep writes in order */
-            while (s->wr_offs != sector_num) {
-                if (s->ret != -EINPROGRESS) {
-                    goto out;
-                }
+            while (s->wr_offs != sector_num && s->ret == -EINPROGRESS) {
                 s->wait_sector_num[index] = sector_num;
                 qemu_coroutine_yield();
             }
             s->wait_sector_num[index] = -1;
         }

-        ret = convert_co_write(s, sector_num, n, buf, status);
-        if (ret < 0) {
-            error_report("error while writing sector %" PRId64
-                         ": %s", sector_num, strerror(-ret));
-            s->ret = ret;
-            goto out;
+        if (s->ret == -EINPROGRESS) {
+            ret = convert_co_write(s, sector_num, n, buf, status);
+            if (ret < 0) {
+                error_report("error while writing sector %" PRId64
+                             ": %s", sector_num, strerror(-ret));
+                s->ret = ret;
+            }
         }

         if (s->wr_in_order) {
@@ -1837,7 +1834,6 @@ static void coroutine_fn convert_co_do_copy(void
*opaque)
         }
     }

-out:
     qemu_vfree(buf);
     s->co[index] = NULL;
     s->running_coroutines--;
@@ -1899,7 +1895,7 @@ static int convert_do_copy(ImgConvertState *s)
         qemu_coroutine_enter(s->co[i]);
     }

-    while (s->ret == -EINPROGRESS) {
+    while (s->running_coroutines) {
         main_loop_wait(false);
     }



This one seems to work; I've been running for several hours qemu-img convert from nbd which spontaneously fails in ~ every 2nd convert; and yet qemu-img keeps terminating smoothly

/Anton



reply via email to

[Prev in Thread] Current Thread [Next in Thread]