[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection
From: |
Marc-André Lureau |
Subject: |
Re: [PATCH 3/4] char-socket: avoid double call tcp_chr_free_connection |
Date: |
Mon, 20 Apr 2020 16:28:05 +0200 |
Hi
On Mon, Apr 20, 2020 at 5:10 AM Li Feng <address@hidden> wrote:
>
> Hi Lureau,
>
> Thanks for your comment.
> I have spent some time to reproduce this crash, so that delay my
> reply, apologies.
>
> This is the description of this bug using gdb:
> 1. Set break points using gdb;
> b chardev/char-socket.c:410 if s->state == TCP_CHARDEV_STATE_DISCONNECTED
> b break vhost_user_write
>
> 2. hotplug a vhost-blk device:
> echo "chardev-add
> socket,id=spdk_vhost_blk2,path=/vhost-blk.0,reconnect=1 "| nc -U
> /tmp/a.socket
> echo "device_add
> vhost-user-blk-pci,chardev=spdk_vhost_blk2,id=spdk_vhost_blk2,num-queues=4"
> | nc -U /tmp/a.socket
>
> 3. Gdb will stop at vhost_user_write, and CTRL-C the vhost target.
> 4. Put 'c' to let gdb continue.
> You will see a crash:
>
> 192 login: qemu-system-x86_64: chardev/char-socket.c:125:
> qemu_chr_socket_restart_timer: Assertion `!s->reconnect_timer' failed.
>
> The related code:
> 394 static void tcp_chr_free_connection(Chardev *chr)
> 395 {
> 396 SocketChardev *s = SOCKET_CHARDEV(chr);
> 397 int i;
> 398
> 399 if (s->read_msgfds_num) {
> 400 for (i = 0; i < s->read_msgfds_num; i++) {
> 401 close(s->read_msgfds[i]);
> 402 }
> 403 g_free(s->read_msgfds);
> 404 s->read_msgfds = NULL;
> 405 s->read_msgfds_num = 0;
> 406 }
> 407
> 408 remove_hup_source(s);
> 409
> 410 tcp_set_msgfds(chr, NULL, 0);
> 411 remove_fd_in_watch(chr);
> 412 object_unref(OBJECT(s->sioc));
> 413 s->sioc = NULL;
> 414 object_unref(OBJECT(s->ioc));
> 415 s->ioc = NULL;
> 416 g_free(chr->filename);
> 417 chr->filename = NULL;
> 418 tcp_chr_change_state(s, TCP_CHARDEV_STATE_DISCONNECTED);
> 419 }
>
> #0 0x0000555555c1aae8 in tcp_chr_free_connection
> (chr=chr@entry=0x55555751ee00) at chardev/char-socket.c:410
> #1 0x0000555555c1aec8 in tcp_chr_disconnect_locked
> (chr=0x55555751ee00) at chardev/char-socket.c:479
> #2 0x0000555555c1af5d in tcp_chr_disconnect (chr=0x55555751ee00) at
> chardev/char-socket.c:497
> #3 0x0000555555c16715 in qemu_chr_fe_set_handlers
> (b=b@entry=0x5555575f3b48, fd_can_read=fd_can_read@entry=0x0,
> fd_read=fd_
> read@entry=0x0, fd_event=fd_event@entry=0x55555588e740
> <vhost_user_blk_event>, be_change=be_change@entry=0x0, opaque=opaque@
> entry=0x5555575f3990, context=0x0, set_open=true) at chardev/char-fe.c:304
> #4 0x000055555588e5c0 in vhost_user_blk_device_realize
> (dev=0x5555575f3990, errp=<optimized out>)
> at /root/qemu-master/hw/block/vhost-user-blk.c:447
> #5 0x00005555558ca478 in virtio_device_realize (dev=0x5555575f3990,
> errp=0x7fffffffbb90)
> at /root/qemu-master/hw/virtio/virtio.c:3615
> #6 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbd20)
> at hw/core/qdev.c:891
> #7 0x0000555555b78e07 in property_set_bool (obj=0x5555575f3990,
> v=<optimized out>, name=<optimized out>, opaque=0x555556453
> 040, errp=0x7fffffffbd20) at qom/object.c:2238
> #8 0x0000555555b7db3f in object_property_set_qobject
> (obj=obj@entry=0x5555575f3990, value=value@entry=0x555556ab89c0, name=
> name@entry=0x555555d15308 "realized", errp=errp@entry=0x7fffffffbd20)
> at qom/qom-qobject.c:26
> #9 0x0000555555b7b2c5 in object_property_set_bool
> (obj=0x5555575f3990, value=<optimized out>, name=0x555555d15308
> "realized
> ", errp=0x7fffffffbd20) at qom/object.c:1390
> #10 0x0000555555af13b2 in virtio_pci_realize (pci_dev=0x5555575eb800,
> errp=0x7fffffffbd20) at hw/virtio/virtio-pci.c:1807
> #11 0x0000555555a7a14b in pci_qdev_realize (qdev=<optimized out>,
> errp=<optimized out>) at hw/pci/pci.c:2098
> #12 0x00005555559dc842 in device_set_realized (obj=<optimized out>,
> value=<optimized out>, errp=0x7fffffffbef8)
> at hw/core/qdev.c:891
>
> (gdb) p s
> $23 = (SocketChardev *) 0x55555751ee00
> (gdb) p s->state
> $24 = TCP_CHARDEV_STATE_DISCONNECTED
>
> At 410 line of char-socket.c, the state has been changed to
> TCP_CHARDEV_STATE_DISCONNECTED before tcp_chr_change_state(s,
> TCP_CHARDEV_STATE_DISCONNECTED);
> It means the tcp_chr_disconnect_locked is called by more than one
> times, and the tcp socket has been freed before.
>
> The crash reason is s->reconnect_timer is set in the previous call of
> tcp_chr_disconnect_locked.
> The another approach fix maybe like this:
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 185fe38dda..94957de367 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -486,7 +486,7 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> if (emit_close) {
> qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> }
> - if (s->reconnect_time) {
> + if (s->reconnect_time && !s->reconnect_timer) {
> qemu_chr_socket_restart_timer(chr);
> }
> }
>
> The base code is here:
> 474 static void tcp_chr_disconnect_locked(Chardev *chr)
> 475 {
> 476 SocketChardev *s = SOCKET_CHARDEV(chr);
> 477 bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> 478
> 479 tcp_chr_free_connection(chr);
> 480
> 481 if (s->listener) {
> 482 qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
> 483 chr, NULL, chr->gcontext);
> 484 }
> 485 update_disconnected_filename(s);
> 486 if (emit_close) {
> 487 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
> 488 }
> 489 if (s->reconnect_time) {
> 490 qemu_chr_socket_restart_timer(chr);
> 491 }
> 492 }
>
> Which one is better?
+ if (s->reconnect_time && !s->reconnect_timer)
Looks like a good solution to me.
>
> I don't know how to inject an error at socket write when writing tests
> code(tests/test-char.c ).
> Any tips?
You could check the patch I just sent fixing a related issue:
https://patchew.org/QEMU/address@hidden/
In your case, it might be as easy as calling qemu_chr_fe_disconnect()
2 times on a reconnect socket, I'll let you figure out.
Thanks!
>
> Thanks,
> Feng Li
>
> Marc-André Lureau <address@hidden> 于2020年4月15日周三 下午6:35写道:
>
> >
> > Hi
> >
> > On Wed, Apr 15, 2020 at 5:31 AM Li Feng <address@hidden> wrote:
> > >
> > > double call tcp_chr_free_connection generates a crash.
> > >
> > > Signed-off-by: Li Feng <address@hidden>
> >
> > Could you describe how you reach the "double call".
> >
> > Even better would be to write a test for it in tests/test-char.c
> >
> > thanks
> >
> > > ---
> > > chardev/char-socket.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 185fe38dda..43aab8f24b 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -476,6 +476,11 @@ static void tcp_chr_disconnect_locked(Chardev *chr)
> > > SocketChardev *s = SOCKET_CHARDEV(chr);
> > > bool emit_close = s->state == TCP_CHARDEV_STATE_CONNECTED;
> > >
> > > + /* avoid re-enter when socket read/write error and disconnect event.
> > > */
> > > + if (s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
> > > + return;
> > > + }
> > > +
> > > tcp_chr_free_connection(chr);
> > >
> > > if (s->listener) {
> > > --
> > > 2.11.0
> > >
> > >
> > > --
> > > The SmartX email address is only for business purpose. Any sent message
> > > that is not related to the business is not authorized or permitted by
> > > SmartX.
> > > 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
> > >
> > >
> > >
> >
> >
> > --
> > Marc-André Lureau
>
> --
> The SmartX email address is only for business purpose. Any sent message
> that is not related to the business is not authorized or permitted by
> SmartX.
> 本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.
>
>
--
Marc-André Lureau
[PATCH v2] char-socket: initialize reconnect timer only when the timer doesn't start, Li Feng, 2020/04/28
[PATCH 4/4] vhost-user-blk: fix crash in realize process, Li Feng, 2020/04/14
[PATCH 2/4] vhost-user-blk: fix invalid memory access, Li Feng, 2020/04/14