From: | Bjoern Teipel |
Subject: | virtio-scsi and block mirroring |
Date: | Wed, 20 Apr 2022 17:29:34 +0000 |
Hello everyone,

I'm looking at an issue where guests freeze (processes in Dl state) during a block disk mirror from one storage to another (NFS), and the network stack of the guest can freeze for up to 10 seconds. Looking at the storage and I/O, I see good throughput and low latency (<3 ms), and I am having trouble tracking down the source of the issue, as neither storage nor networking shows problems.

Interestingly, when I run the same test with virtio-blk, I do not really see process freezes at the same frequency or duration as with virtio-scsi, which seems to indicate a client-side rather than a storage-side problem.

The copy job is set up by OpenStack volume migration and translates into:

  <mirror type='file' file='/var/lib/nova/mnt/xxx' format='raw' job='copy'>
    <format type='raw'/>
    <source file='/var/lib/nova/mnt/yyy' index='4'/>
    <backingStore/>
  </mirror>

From what I have observed, the issue is more noticeable when I see more fdatasync calls during the copy, but I haven't been able to correlate that with the issue 100% yet:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 28.51   20.672654        8339      2479           ioctl
 27.81   20.162714        3379      5967        31 futex
 22.02   15.964498         785     20335           poll
 15.22   11.038403         150     73561           io_submit
  4.17    3.023285          41     73540           lseek
  1.20    0.868003           5    158591           write
  0.63    0.459030          11     42871           ppoll
  0.22    0.159263           8     19314           recvmsg
  0.16    0.115520           5     22526           read
  0.04    0.029149       29149         1           restart_syscall
  0.01    0.009252          28       330           sendmsg
  0.00    0.001221        1221         1           munmap
  0.00    0.000458          22        21           fcntl
  0.00    0.000286          95         3           openat
  0.00    0.000166           5        32           rt_sigprocmask
  0.00    0.000103          10        10           fdatasync
  0.00    0.000099          25         4           clone
  0.00    0.000081           7        12           mmap
  0.00    0.000077          19         4           close
  0.00    0.000076           6        12           mprotect
  0.00    0.000056          14         4           madvise
  0.00    0.000025           6         4           set_robust_list
  0.00    0.000023           6         4           prctl
------ ----------- ----------- --------- --------- ----------------
100.00   72.504442                419626        31 total

Does anyone have an idea how to better debug this issue?

Thanks,
Bjoern
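
One way to read the syscall summary above is per call rather than by total time, since a freeze is more likely to show up as a few slow calls than as many cheap ones. The following is only a minimal Python sketch, assuming the table is `strace -c` style output (its layout suggests so); the names `parse_strace_summary`, `slowest_per_call` and `SAMPLE` are hypothetical helpers, and `SAMPLE` holds just a subset of the rows from the table.

```python
import re

# One data row of an `strace -c` style summary:
# % time, seconds, usecs/call, calls, optional errors, syscall name.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\d+)\s+(?:(\d+)\s+)?([a-z_][a-z_0-9]*)\s*$"
)

# Subset of the rows from the summary in this message (not the full table).
SAMPLE = """
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
28.51 20.672654 8339 2479 ioctl
27.81 20.162714 3379 5967 31 futex
22.02 15.964498 785 20335 poll
15.22 11.038403 150 73561 io_submit
0.00 0.000103 10 10 fdatasync
------ ----------- ----------- --------- --------- ----------------
100.00 72.504442 419626 31 total
"""


def parse_strace_summary(text):
    """Return one dict per syscall row; header, rule and total lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        pct, seconds, usecs, calls, errors, name = match.groups()
        if name == "total":
            # The total line has no usecs/call column, so its fields are shifted.
            continue
        rows.append(
            {
                "syscall": name,
                "pct_time": float(pct),
                "seconds": float(seconds),
                "usecs_per_call": int(usecs),
                "calls": int(calls),
                "errors": int(errors) if errors else 0,
            }
        )
    return rows


def slowest_per_call(rows):
    """Sort rows by mean time per call, slowest first."""
    return sorted(rows, key=lambda r: r["usecs_per_call"], reverse=True)


if __name__ == "__main__":
    for row in slowest_per_call(parse_strace_summary(SAMPLE)):
        print(f'{row["syscall"]:<12} {row["usecs_per_call"]:>8} us/call  {row["calls"]:>7} calls')
```

Ranked this way, ioctl averages roughly 8.3 ms per call and futex roughly 3.4 ms, both above the <3 ms storage latency mentioned above, while the 10 fdatasync calls in this particular sample are cheap. The summary only gives averages, so per-call timing (for example `strace -T`) would be needed to see whether individual calls line up with the freezes.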