[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] rdma: Fix cleanup in error paths
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] [PATCH] rdma: Fix cleanup in error paths |
Date: |
Wed, 25 Mar 2015 10:39:37 +0000 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
> As part of commit e325b49a320b493cc5d69e263751ff716dc458fe,
> order in which resources are destroyed was changed for fixing
> a seg fault. Due to this change, CQ will never get destroyed as
> CQ should be destroyed after QP destruction. Seg fault is caused
> improper cleanup when connection fails. Fixing cleanup after
> connection failure and order in which resources are destroyed
> in qemu_rdma_cleanup() routine.
Do you have a test case that triggers the seg fault?
> Signed-off-by: Meghana Cheripady <address@hidden>
> Signed-off-by: Padmanabh Ratnakar <address@hidden>
> ---
> migration/rdma.c | 22 ++++++++--------------
> 1 files changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index e6c3a67..77e3444 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2194,6 +2194,10 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> }
> }
>
> + if (rdma->qp) {
> + rdma_destroy_qp(rdma->cm_id);
> + rdma->qp = NULL;
> + }
OK, that makes sense, the manpage for rdma_destroy_qp says
'Users must destroy any QP associated with an rdma_cm_id before destroying the
ID.'
so it's probably good to have this early.
> if (rdma->cq) {
> ibv_destroy_cq(rdma->cq);
> rdma->cq = NULL;
> @@ -2206,18 +2210,14 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> ibv_dealloc_pd(rdma->pd);
> rdma->pd = NULL;
> }
> - if (rdma->listen_id) {
> - rdma_destroy_id(rdma->listen_id);
> - rdma->listen_id = NULL;
> - }
> if (rdma->cm_id) {
> - if (rdma->qp) {
> - rdma_destroy_qp(rdma->cm_id);
> - rdma->qp = NULL;
> - }
> rdma_destroy_id(rdma->cm_id);
> rdma->cm_id = NULL;
> }
> + if (rdma->listen_id) {
> + rdma_destroy_id(rdma->listen_id);
> + rdma->listen_id = NULL;
> + }
Can you explain this reorder - why is the order of listen_id and cm_id
important?
is that a failure on the destination?
> if (rdma->channel) {
> rdma_destroy_event_channel(rdma->channel);
> rdma->channel = NULL;
> @@ -2309,8 +2309,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> if (ret) {
> perror("rdma_connect");
> ERROR(errp, "connecting to destination!");
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
>
> @@ -2319,8 +2317,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> perror("rdma_get_cm_event after rdma_connect");
> ERROR(errp, "connecting to destination!");
> rdma_ack_cm_event(cm_event);
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
>
> @@ -2328,8 +2324,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> perror("rdma_get_cm_event != EVENT_ESTABLISHED after rdma_connect");
> ERROR(errp, "connecting to destination!");
> rdma_ack_cm_event(cm_event);
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
> rdma->connected = true;
> --
> 1.7.1
Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK