24.03.2020 12:50, Kevin Wolf wrote:
Am 24.03.2020 um 08:14 hat Philippe Mathieu-Daudé geschrieben:
On 3/24/20 4:05 AM, Mansour Ahmadi wrote:
Hi,
Nullness of needs to be checked here:
https://github.com/qemu/qemu/blob/c532b954d96f96d361ca31308f75f1b95bd4df76/block/commit.c#L221
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),...
Do you have a reproducer? It's not obvious to me how bs->backing could
be NULL here.
While it is done at 2 other locations:
https://github.com/qemu/qemu/blob/c532b954d96f96d361ca31308f75f1b95bd4df76/block/backup-top.c#L113
https://github.com/qemu/qemu/blob/c532b954d96f96d361ca31308f75f1b95bd4df76/block/mirror.c#L1477
Commit 18775ff3269 made the change for mirror, however its commit
message is terse and doesn't say anything about the scenario where it
would happen. We also didn't add a test case for it. I would have
expected that failure to add the backing file would immediately error
out and not try to refresh the filename first.
backup-top.c has the check from the beginning. I assume it just copied
it from mirror.
Vladimir, do you remember the details?
No :(
But I believe that "Backing may be zero after failed bdrv_attach_child in
bdrv_set_backing_hd, which leads to SIGSEGV." means real problem, probably
reproduced on some experiment branch, not on master.
Let's try most simple check:
apply the following:
=============================
diff --git a/include/block/block.h b/include/block/block.h
index e569a4d747..dc20b55075 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -298,6 +298,8 @@ BdrvChild *bdrv_open_child(const char *filename,
const BdrvChildRole *child_role,
bool allow_none, Error **errp);
BlockDriverState *bdrv_open_blockdev_ref(BlockdevRef *ref, Error **errp);
+void bdrv_set_backing_hd_ex(BlockDriverState *bs, BlockDriverState *backing_hd,
+ bool bdrv_attach_child_fail, Error **errp);
void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
Error **errp);
int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
diff --git a/block.c b/block.c
index a2542c977b..21b6124d73 100644
--- a/block.c
+++ b/block.c
@@ -2743,8 +2743,8 @@ static bool bdrv_inherits_from_recursive(BlockDriverState
*child,
* Sets the backing file link of a BDS. A new reference is created; callers
* which don't need their own reference any more must call bdrv_unref().
*/
-void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
- Error **errp)
+void bdrv_set_backing_hd_ex(BlockDriverState *bs, BlockDriverState *backing_hd,
+ bool bdrv_attach_child_fail, Error **errp)
{
bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
bdrv_inherits_from_recursive(backing_hd, bs);
@@ -2766,8 +2766,14 @@ void bdrv_set_backing_hd(BlockDriverState *bs,
BlockDriverState *backing_hd,
goto out;
}
- bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_backing,
- errp);
+ if (bdrv_attach_child_fail) {
+ bs->backing = NULL;
+ error_setg(errp, "Unpredicted failure :)");
+ } else {
+ bs->backing = bdrv_attach_child(bs, backing_hd, "backing",
&child_backing,
+ errp);
+ }
+
/* If backing_hd was already part of bs's backing chain, and
* inherits_from pointed recursively to bs then let's update it to
* point directly to bs (else it will become NULL). */
@@ -2779,6 +2785,12 @@ out:
bdrv_refresh_limits(bs, NULL);
}
+void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
+ Error **errp)
+{
+ bdrv_set_backing_hd_ex(bs, backing_hd, false, errp);
+}
/*
* Opens the backing file for a BlockDriverState if not yet open
*
diff --git a/block/mirror.c b/block/mirror.c
index 447051dbc6..907bb2b718 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -675,7 +675,7 @@ static int mirror_exit_common(Job *job)
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
BlockDriverState *backing = s->is_none_mode ? src : s->base;
if (backing_bs(target_bs) != backing) {
- bdrv_set_backing_hd(target_bs, backing, &local_err);
+ bdrv_set_backing_hd_ex(target_bs, backing, true, &local_err);
if (local_err) {
error_report_err(local_err);
ret = -EPERM;
@@ -1477,6 +1477,7 @@ static void
bdrv_mirror_top_refresh_filename(BlockDriverState *bs)
if (bs->backing == NULL) {
/* we can be here after failed bdrv_attach_child in
* bdrv_set_backing_hd */
+ abort();
return;
}
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
==============================================
[root@kvm qemu-iotests]# git grep -il mirror ??? | xargs
030 041 044 094 109 127 129 132 139 141 151 152 155 156 185 194 218 219 229 235
248 255 257 281
check -qcow2 030 041 044 094 109 127 129 132 139 141 151 152 155 156 185 194
218 219 229 235 248 255 257 281
...
Failures: 041 141 155
Have several core dumps, look at one:
Wow! These crashes are due to another bug: use after free!
Fix it:
diff --git a/block/mirror.c b/block/mirror.c
index 907bb2b718..22cf48e46a 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -678,6 +678,7 @@ static int mirror_exit_common(Job *job)
bdrv_set_backing_hd_ex(target_bs, backing, true, &local_err);
if (local_err) {
error_report_err(local_err);
+ local_err = NULL;
ret = -EPERM;
}
}
@@ -716,6 +717,7 @@ static int mirror_exit_common(Job *job)
bdrv_drained_end(target_bs);
if (local_err) {
error_report_err(local_err);
+ local_err = NULL;
ret = -EPERM;
}
}
Roll:
make -j9 && check -qcow2 030 041 044 094 109 127 129 132 139 141 151 152 155
156 185 194 218 219 229 235 248 255 257 281
Aha, new crashes! Let's look at them.
41 and 155 failed with crash, 141 without but I see "+{"error": {"class": "GenericError",
"desc": "Block device drv0 is in use"}}" in its output.
Hmm, but all crashes are because of assert(QTAILQ_EMPTY(&all_bdrv_states)); in
bdrv_close_all..
So, we have a problem, but another one..