qemu-devel

Re: [Virtio-fs] virtio-fs performance


From: Vivek Goyal
Subject: Re: [Virtio-fs] virtio-fs performance
Date: Tue, 28 Jul 2020 11:27:19 -0400

On Tue, Jul 28, 2020 at 02:49:36PM +0100, Stefan Hajnoczi wrote:
> > I'm trying and testing the virtio-fs feature in QEMU v5.0.0.
> > My host and guest OS are both ubuntu 18.04 with kernel 5.4, and the
> > underlying storage is one single SSD.
> > 
> > The configurations are:
> > (1) virtiofsd
> > ./virtiofsd -o source=/mnt/ssd/virtiofs,cache=auto,flock,posix_lock,writeback,xattr \
> >     --thread-pool-size=1 --socket-path=/tmp/vhostqemu
> > 
> > (2) qemu
> > qemu-system-x86_64 \
> > -enable-kvm \
> > -name ubuntu \
> > -cpu Westmere \
> > -m 4096 \
> > -global kvm-apic.vapic=false \
> > -netdev tap,id=hn0,vhost=off,br=br0,helper=/usr/local/libexec/qemu-bridge-helper \
> > -device e1000,id=e0,netdev=hn0 \
> > -blockdev '{"node-name": "disk0", "driver": "qcow2",
> > "refcount-cache-size": 1638400, "l2-cache-size": 6553600, "file": {
> > "driver": "file", "filename": "'${imagefolder}\/ubuntu.qcow2'"}}' \
> > -device virtio-blk,drive=disk0,id=disk0 \
> > -chardev socket,id=ch0,path=/tmp/vhostqemu \
> > -device vhost-user-fs-pci,chardev=ch0,tag=myfs \
> > -object memory-backend-memfd,id=mem,size=4G,share=on \
> > -numa node,memdev=mem \
> > -qmp stdio \
> > -vnc :0
> > 
> > (3) guest
> > mount -t virtiofs myfs /mnt/virtiofs
> > 
> > I tried to change virtiofsd's --thread-pool-size value and test the
> > storage performance by fio.
> > Before each read/write/randread/randwrite test, the pagecaches of
> > guest and host are dropped.
> > 
> > ```
> > RW="read" # or write/randread/randwrite
> > fio --name=test --rw=$RW --bs=4k --numjobs=1 --ioengine=libaio \
> >     --runtime=60 --direct=0 --iodepth=64 --size=10g \
> >     --filename=/mnt/virtiofs/testfile

A couple of things:

- Can you try the cache=none option in virtiofsd? That will bypass the
  page cache in the guest. It also gets rid of the latencies currently
  caused by file_remove_privs().

- Also, with direct=0, are we really driving an iodepth of 64? With
  direct=0 it is cached I/O. Is it still asynchronous at that point, or
  have we fallen back to synchronous I/O with a queue depth of 1?

- With cache=auto/always, I am seeing performance issues with small writes
  and am trying to address them.

https://lore.kernel.org/linux-fsdevel/20200716144032.GC422759@redhat.com/
https://lore.kernel.org/linux-fsdevel/20200724183812.19573-1-vgoyal@redhat.com/
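
For the cache=none test, a minimal sketch of the host-side command,
assuming every other option and path from the report stays unchanged. The
helper name virtiofsd_cmd is made up for illustration, and it only prints
the command line so it can be reviewed before anything is started.

```shell
# Hypothetical helper: prints the host-side virtiofsd command for a given
# cache mode and thread pool size. All other options and paths are copied
# unchanged from the original report.
virtiofsd_cmd() {
    cache_mode="$1"
    pool_size="$2"
    printf '%s\n' "./virtiofsd -o source=/mnt/ssd/virtiofs,cache=${cache_mode},flock,posix_lock,writeback,xattr --thread-pool-size=${pool_size} --socket-path=/tmp/vhostqemu"
}

# The two configurations worth comparing with the guest page cache bypassed.
virtiofsd_cmd none 1
virtiofsd_cmd none 64
```

The guest mount command does not need to change for this.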

Thanks
Vivek

> > ```
> > 
> > --thread-pool-size=64 (default)
> >     seq read: 305 MB/s
> >     seq write: 118 MB/s
> >     rand 4KB read: 2222 IOPS
> >     rand 4KB write: 21100 IOPS
> > 
> > --thread-pool-size=1
> >     seq read: 387 MB/s
> >     seq write: 160 MB/s
> >     rand 4KB read: 2622 IOPS
> >     rand 4KB write: 30400 IOPS
> > 
> > The results show that performance with the default thread pool size
> > (64) is poorer than with a single thread.
> > Is it due to the lock contention of the multiple threads?
> > When can virtio-fs get better performance using multiple threads?
> > 
> > 
> > I also tested the performance of the guest accessing the host's files
> > via the NFSv4/CIFS network filesystems.
> > The "seq read" and "randread" performance of virtio-fs is also worse
> > than that of NFSv4 and CIFS.
> > 
> > NFSv4:
> >   seq write: 244 MB/s
> >   rand 4K read: 4086 IOPS
> > 
> > I cannot figure out why the perf of NFSv4/CIFS with the network stack
> > is better than virtio-fs.
> > Is it expected? Or, do I have an incorrect configuration?
> 
> No, I remember benchmarking the thread pool and did not see such a big
> difference.
> 
> Please use direct=1 so that each I/O results in a virtio-fs request.
> Otherwise the I/O pattern is not directly controlled by the benchmark
> but by the page cache (readahead, etc).
> 
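
A sketch of the same four fio runs with direct=1, assuming all other
parameters from the report are kept. The helper name fio_cmd is
hypothetical. The loop only prints the guest-side commands, including the
page cache drop from the original procedure.

```shell
# Hypothetical helper: prints the guest-side fio command for one workload.
# Parameters are those of the original report, except direct=1.
fio_cmd() {
    rw="$1"
    printf '%s\n' "fio --name=test --rw=${rw} --bs=4k --numjobs=1 --ioengine=libaio --runtime=60 --direct=1 --iodepth=64 --size=10g --filename=/mnt/virtiofs/testfile"
}

for rw in read write randread randwrite; do
    # Drop the guest page cache first, as in the original procedure (needs root).
    printf '%s\n' "sync; echo 3 > /proc/sys/vm/drop_caches"
    fio_cmd "$rw"
done
```

Piping the output through sh as root in the guest runs the sequence. The
"IO depths" line in fio's output shows the distribution of queue depths
that was actually reached during each run.
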
> Using numactl(8) or taskset(1) to launch virtiofsd allows you to control
> NUMA and CPU scheduling properties. For example, you could force all 64
> threads to run on the same host CPU using taskset to see if that helps
> this I/O bound workload.
> 
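
A sketch of the pinning, assuming host CPU 0 and NUMA node 0 as arbitrary
example targets. The helper name pinned_cmd is hypothetical, and it only
prints the wrapped command. Threads created by virtiofsd inherit the CPU
affinity set by taskset.

```shell
# Hypothetical helper: prefixes a command with a CPU or NUMA placement wrapper.
# CPU 0 and NUMA node 0 are example choices, not values from the report.
pinned_cmd() {
    mode="$1"
    shift
    case "$mode" in
        cpu)  printf '%s\n' "taskset -c 0 $*" ;;                         # one host CPU
        node) printf '%s\n' "numactl --cpunodebind=0 --membind=0 $*" ;;  # one NUMA node
        *)    printf '%s\n' "$*" ;;                                      # no pinning
    esac
}

# virtiofsd command line as given in the original report, with 64 threads.
VIRTIOFSD="./virtiofsd -o source=/mnt/ssd/virtiofs,cache=auto,flock,posix_lock,writeback,xattr --thread-pool-size=64 --socket-path=/tmp/vhostqemu"

pinned_cmd cpu "$VIRTIOFSD"
pinned_cmd node "$VIRTIOFSD"
```
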
> fio can collect detailed statistics on queue depths and a latency
> histogram. It would be interesting to compare the --thread-pool-size=64
> and --thread-pool-size=1 numbers.
> 
> Comparing the "perf record -e kvm:kvm_exit" counts between the two might
> also be interesting.
> 
> Stefan
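
A sketch of how the kvm_exit comparison could be collected on the host,
assuming a 60 second window to match fio's runtime and system-wide
recording. The helper name perf_cmds and the output file names are made up
for illustration. The helper only prints the commands, which need root.

```shell
# Hypothetical helper: prints host-side perf commands for one labelled run.
# The 60 second window matches --runtime=60 in the fio job.
perf_cmds() {
    label="$1"
    printf '%s\n' "perf record -e kvm:kvm_exit -a -o kvm_exit-${label}.data -- sleep 60"
    printf '%s\n' "perf report -i kvm_exit-${label}.data --stdio"
}

# One recording per thread pool size, started while fio runs in the guest.
perf_cmds pool64
perf_cmds pool1
```
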






