qemu-block

Re: [ovirt-users] Re: Any way to terminate stuck export task


From: Nir Soffer
Subject: Re: [ovirt-users] Re: Any way to terminate stuck export task
Date: Tue, 6 Jul 2021 15:52:09 +0300

On Tue, Jul 6, 2021 at 10:21 AM Gianluca Cecchi
<gianluca.cecchi@gmail.com> wrote:
>
> On Mon, Jul 5, 2021 at 5:06 PM Nir Soffer <nsoffer@redhat.com> wrote:
>
>>
>>
>> qemu-img is busy in posix_fallocate(), writing one byte to every 4k block.
>>
>> If you add -tt -T (as I suggested), we can see how much time each write 
>> takes,
>> which may explain why this takes so much time.
>>
>>     strace -f -p 14342 -tt -T
>>
>
> It seems I missed part of your suggestion... I didn't get the "-tt -T" (or I 
> didn't see it...)
>
> With it I get this during the export (the host console shows about 4 
> Mbit/s of network traffic):
>
> # strace -f -p 25243 -tt -T
> strace: Process 25243 attached with 2 threads
> [pid 25243] 09:17:32.503907 ppoll([{fd=9, events=POLLIN|POLLERR|POLLHUP}], 1, 
> NULL, NULL, 8 <unfinished ...>
> [pid 25244] 09:17:32.694207 pwrite64(12, "\0", 1, 3773509631) = 1 <0.000059>
> [pid 25244] 09:17:32.694412 pwrite64(12, "\0", 1, 3773513727) = 1 <0.000078>
> [pid 25244] 09:17:32.694608 pwrite64(12, "\0", 1, 3773517823) = 1 <0.000056>
> [pid 25244] 09:17:32.694729 pwrite64(12, "\0", 1, 3773521919) = 1 <0.000024>
> [pid 25244] 09:17:32.694796 pwrite64(12, "\0", 1, 3773526015) = 1 <0.000020>
> [pid 25244] 09:17:32.694855 pwrite64(12, "\0", 1, 3773530111) = 1 <0.000015>
> [pid 25244] 09:17:32.694908 pwrite64(12, "\0", 1, 3773534207) = 1 <0.000014>
> [pid 25244] 09:17:32.694950 pwrite64(12, "\0", 1, 3773538303) = 1 <0.000016>
> [pid 25244] 09:17:32.694993 pwrite64(12, "\0", 1, 3773542399) = 1 <0.200032>
> [pid 25244] 09:17:32.895140 pwrite64(12, "\0", 1, 3773546495) = 1 <0.000034>
> [pid 25244] 09:17:32.895227 pwrite64(12, "\0", 1, 3773550591) = 1 <0.000029>
> [pid 25244] 09:17:32.895296 pwrite64(12, "\0", 1, 3773554687) = 1 <0.000024>
> [pid 25244] 09:17:32.895353 pwrite64(12, "\0", 1, 3773558783) = 1 <0.000016>
> [pid 25244] 09:17:32.895400 pwrite64(12, "\0", 1, 3773562879) = 1 <0.000015>
> [pid 25244] 09:17:32.895443 pwrite64(12, "\0", 1, 3773566975) = 1 <0.000015>
> [pid 25244] 09:17:32.895485 pwrite64(12, "\0", 1, 3773571071) = 1 <0.000015>
> [pid 25244] 09:17:32.895527 pwrite64(12, "\0", 1, 3773575167) = 1 <0.000017>
> [pid 25244] 09:17:32.895570 pwrite64(12, "\0", 1, 3773579263) = 1 <0.199493>
> [pid 25244] 09:17:33.095147 pwrite64(12, "\0", 1, 3773583359) = 1 <0.000031>
> [pid 25244] 09:17:33.095262 pwrite64(12, "\0", 1, 3773587455) = 1 <0.000061>
> [pid 25244] 09:17:33.095378 pwrite64(12, "\0", 1, 3773591551) = 1 <0.000027>
> [pid 25244] 09:17:33.095445 pwrite64(12, "\0", 1, 3773595647) = 1 <0.000021>
> [pid 25244] 09:17:33.095498 pwrite64(12, "\0", 1, 3773599743) = 1 <0.000016>
> [pid 25244] 09:17:33.095542 pwrite64(12, "\0", 1, 3773603839) = 1 <0.000014>

Most writes are pretty fast, but from time to time there is a very slow write.

From the small sample you posted, we have:

    awk '{print $11}' strace.out | sed -e "s/<//" -e "s/>//" | awk '
        {sum += $1; if ($1 < 0.1) {fast += $1; fast_nr++} else {slow += $1; slow_nr++}}
        END {printf "average: %.6f slow: %.6f fast: %.6f\n", sum/NR, slow/slow_nr, fast/fast_nr}'

average: 0.016673 slow: 0.199763 fast: 0.000028
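The same bucketing can be done with a short Python sketch (a rough equivalent
of the awk pipeline above; the `summarize` helper and the 0.1 s fast/slow
threshold are illustrative, not part of any tool):

```python
import re

def summarize(lines, threshold=0.1):
    """Split per-syscall durations (the <...> suffix added by strace -T)
    into fast and slow buckets and return the three averages."""
    fast, slow = [], []
    for line in lines:
        m = re.search(r"<(\d+\.\d+)>\s*$", line)
        if m:
            t = float(m.group(1))
            (fast if t < threshold else slow).append(t)
    times = fast + slow
    return (sum(times) / len(times),
            sum(slow) / len(slow) if slow else 0.0,
            sum(fast) / len(fast) if fast else 0.0)

# Two lines taken from the strace output above.
sample = [
    '[pid 25244] 09:17:32.694207 pwrite64(12, "\\0", 1, 3773509631) = 1 <0.000059>',
    '[pid 25244] 09:17:32.694993 pwrite64(12, "\\0", 1, 3773542399) = 1 <0.200032>',
]
avg, slow_avg, fast_avg = summarize(sample)
print("average: %.6f slow: %.6f fast: %.6f" % (avg, slow_avg, fast_avg))
```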

Preallocating a 300 GiB disk will take about 15 days :-)

>>> 300*1024**3 / 4096 * 0.016673 / 3600 / 24
15.17613511111111

If all writes were fast, it would take less than an hour:

>>> 300*1024**3 / 4096 * 0.000028 / 3600
0.6116693333333333

> . . .
>
> BTW: it seems my NAS appliance doesn't support 4.2 version of NFS, because if 
> I force it, I then get an error in mount and in engine.log this error for 
> both nodes as they try to mount:
>
> 2021-07-05 17:01:56,082+02 ERROR 
> [org.ovirt.engine.core.bll.storage.connection.FileStorageHelper] 
> (EE-ManagedThreadFactory-engine-Thread-2554190) [642eb6be] The connection 
> with details '172.16.1.137:/nas/EXPORT-DOMAIN' failed because of error code 
> '477' and error message is: problem while trying to mount target
>
>
> and in vdsm.log:
> MountError: (32, ';mount.nfs: Protocol not supported\n')

Too bad.

You can evaluate how oVirt 4.4 will work with this appliance using this dd
command:

    dd if=/dev/zero bs=8M count=38400 of=/path/to/new/disk oflag=direct conv=fsync

We don't use dd for this, but the operation is the same on NFS < 4.2.

Based on the 50 MiB/s rate you reported earlier, I guess you have a 1 Gbit
network to this appliance, so zeroing can do up to 128 MiB/s, which will take
about 40 minutes for 300G.
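For reference, that estimate in the same back-of-the-envelope style as the
ones above (the 128 MiB/s figure is the assumed usable throughput of a 1 Gbit
link):

```python
size = 300 * 1024**3     # 300 GiB disk
rate = 128 * 1024**2     # ~128 MiB/s usable on a 1 Gbit link (assumption)
print(size / rate / 60)  # minutes -> 40.0
```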

Using NFS 4.2, fallocate will complete in less than a second.
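A quick way to check whether a given mount gets the fast path is to time
posix_fallocate() directly: on Linux, glibc implements it with fallocate(2)
when the filesystem supports it (NFS 4.2 does), and otherwise falls back to
the one-byte-per-block write loop visible in the strace above. A minimal
sketch (it uses a temp directory only so it is self-contained; point `path`
at a file on the mount under test, and use a realistic size such as
90 * 1024**3):

```python
import os
import tempfile
import time

# For a real test, set this to a file on the NFS mount being evaluated.
path = os.path.join(tempfile.mkdtemp(), "prealloc-test.img")
size = 64 * 1024**2  # small size so the sketch runs anywhere

fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
try:
    start = time.monotonic()
    # Fast fallocate(2) on NFS 4.2; slow per-block write fallback otherwise.
    os.posix_fallocate(fd, 0, size)
    elapsed = time.monotonic() - start
    print("preallocated %d bytes in %.3f seconds" % (size, elapsed))
finally:
    os.close(fd)
    os.unlink(path)
    os.rmdir(os.path.dirname(path))
```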

Here is an example from my test system, creating a 90 GiB raw preallocated volume:

2021-07-06 15:46:40,382+0300 INFO  (tasks/1) [storage.Volume] Request
to create RAW volume
/rhev/data-center/mnt/storage2:_export_00/a600ba04-34f9-4793-a5dc-6d4150716d14/images/bcf7c623-8fd8-47b3-aaee-a65c0872536d/82def38d-b41b-4126-826e-0513d669f1b5
with capacity = 96636764160 (fileVolume:493)
...
2021-07-06 15:46:40,447+0300 INFO  (tasks/1) [storage.Volume]
Preallocating volume
/rhev/data-center/mnt/storage2:_export_00/a600ba04-34f9-4793-a5dc-6d4150716d14/images/bcf7c623-8fd8-47b3-aaee-a65c0872536d/82def38d-b41b-4126-826e-0513d669f1b5:
0.05 seconds (utils:390)

Nir