Re: [Qemu-devel] [PATCH v0] Implement new cache mode "target"
From: Stefan Hajnoczi
Date: Thu, 15 Aug 2019 14:53:09 +0100
User-agent: Mutt/1.12.1 (2019-06-15)
On Wed, Aug 07, 2019 at 04:09:54PM +0300, Artemy Kapitula wrote:
Hi,
Please use "scripts/get_maintainer.pl -f block.c" to find out which
maintainers to email. qemu-devel (address@hidden) is a high-traffic list, and
patches that are not CCed to the right maintainers may not get a quick review.
> There is an issue with databases in VMs that perform too slowly
> on generic SAN storage. The key point is fdatasync(), which flushes
> the disk on the SCSI target.
>
> The QEMU blockdev "target" cache mode is intended to be used with
> SAN storage and is a mix of "none" (it uses direct I/O) and
> "unsafe" (it omits the device flush).
>
> Such storage has its own data integrity protection and can
> be operated with direct I/O without an additional fdatasync().
>
> With generic SCSI targets like LIO or SCST it boosts performance
> by up to 100% on some profiles, such as databases with a transaction journal
> (postgresql/mssql/oracle etc.) or virtualized SDS (ceph/rook inside
> VMs), which perform a block device cache flush on each journal record.
If the physical storage controller has a Battery Backup Unit (BBU) or
similar, then flush requests are not required with O_DIRECT. This has
been a common enterprise storage configuration for many years and is
already supported in QEMU today:
Configure the guest with cache=none and disable the emulated storage
controller's write cache (e.g. -device virtio-blk-pci,write-cache=off).
Inside the guest /sys/block/$BLKDEV/queue/write_cache should show "write
through".
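The same check can be done programmatically. The following is only a rough
sketch: the function name write_cache_is_writethrough is hypothetical, and it
assumes the sysfs attribute contains either "write through" or "write back"
followed by a newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU or the kernel).
 * `path` is expected to be /sys/block/<dev>/queue/write_cache.
 * Returns 1 if the attribute reads "write through",
 * 0 if it reads anything else (e.g. "write back"),
 * and -1 if the file cannot be opened or read.
 */
static int write_cache_is_writethrough(const char *path)
{
    char buf[64];
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (!f) {
        return -1;
    }
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Strip the trailing newline before comparing. */
    buf[strcspn(buf, "\n")] = '\0';
    return strcmp(buf, "write through") == 0;
}
```

Inside the guest this would be called with the path of the disk in question,
for example "/sys/block/vda/queue/write_cache" (vda is just an example name).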
I think this patch is not necessary since write-cache=off already
exists. cache=target is also slower: the guest still sees a writeback
cache, so it sends flush requests to the emulated storage controller
that are then ignored.
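To make the comparison concrete, here is a standalone sketch of the mode
table from the quoted qemu-options.hx hunk, plus the standard "writeback" row.
It is not QEMU code: struct cache_opts, lookup_cache_mode() and
guest_sees_write_cache() are hypothetical names, and the last function encodes
my assumption that the device's write-cache property follows cache.writeback
unless write-cache=off is given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the three -blockdev cache options; not QEMU's own types. */
struct cache_opts {
    bool writeback;  /* cache.writeback */
    bool direct;     /* cache.direct */
    bool no_flush;   /* cache.no-flush */
};

static const struct {
    const char *mode;
    struct cache_opts opts;
} cache_modes[] = {
    { "writeback",    { true,  false, false } },
    { "none",         { true,  true,  false } },
    { "writethrough", { false, false, false } },
    { "directsync",   { false, true,  false } },
    { "unsafe",       { true,  false, true  } },
    { "target",       { true,  true,  true  } },  /* proposed by the patch */
};

/* Returns 0 and fills *out for a known mode, -1 otherwise. */
static int lookup_cache_mode(const char *mode, struct cache_opts *out)
{
    assert(mode != NULL && out != NULL);

    for (size_t i = 0; i < sizeof(cache_modes) / sizeof(cache_modes[0]); i++) {
        if (strcmp(mode, cache_modes[i].mode) == 0) {
            *out = cache_modes[i].opts;
            return 0;
        }
    }
    return -1;
}

/*
 * Assumption: the guest is told about a volatile write cache (and so
 * issues flushes) when cache.writeback is on and the device was not
 * configured with write-cache=off.
 */
static bool guest_sees_write_cache(const struct cache_opts *o,
                                   bool device_write_cache_off)
{
    return o->writeback && !device_write_cache_off;
}
```

Under these assumptions, "target" leaves guest_sees_write_cache() true, so the
guest keeps issuing flushes, while "none" combined with write-cache=off makes
it false.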
Thanks,
Stefan
> Signed-off-by: Artemy Kapitula <address@hidden>
> ---
> block.c | 4 ++++
> qemu-options.hx | 3 ++-
> tests/qemu-iotests/026 | 2 +-
> tests/qemu-iotests/091 | 2 +-
> 4 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/block.c b/block.c
> index cbd8da5f3b..60919d82ff 100644
> --- a/block.c
> +++ b/block.c
> @@ -884,6 +884,10 @@ int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough)
>      } else if (!strcmp(mode, "unsafe")) {
>          *writethrough = false;
>          *flags |= BDRV_O_NO_FLUSH;
> +    } else if (!strcmp(mode, "target")) {
> +        *writethrough = false;
> +        *flags |= BDRV_O_NOCACHE;
> +        *flags |= BDRV_O_NO_FLUSH;
>      } else if (!strcmp(mode, "writethrough")) {
>          *writethrough = true;
>      } else {
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9621e934c0..01f1f4ad34 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1065,7 +1065,7 @@ This option defines the type of the media: disk or cdrom.
> @var{snapshot} is "on" or "off" and controls snapshot mode for the given drive
> (see @option{-snapshot}).
> @item cache=@var{cache}
> -@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough"
> +@var{cache} is "none", "writeback", "unsafe", "target", "directsync" or "writethrough"
> and controls how the host cache is used to access block data. This is a
> shortcut that sets the @option{cache.direct} and @option{cache.no-flush}
> options (as in @option{-blockdev}), and additionally @option{cache.writeback},
> @@ -1084,6 +1084,7 @@ none         │ on                on             off
> writethrough │ off               off            off
> directsync   │ off               on             off
> unsafe       │ on                off            on
> +target       │ on                on             on
> @end example
> The default mode is @option{cache=writeback}.
> diff --git a/tests/qemu-iotests/026 b/tests/qemu-iotests/026
> index e30243608b..e7179b0de4 100755
> --- a/tests/qemu-iotests/026
> +++ b/tests/qemu-iotests/026
> @@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
> _supported_fmt qcow2
> _supported_proto file
> _default_cache_mode "writethrough"
> -_supported_cache_modes "writethrough" "none"
> +_supported_cache_modes "writethrough" "none" "target"
> # The refcount table tests expect a certain minimum width for refcount entries
> # (so that the refcount table actually needs to grow); that minimum is 16 bits,
> # being the default refcount entry width.
> diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091
> index d62ef18a02..2eaf258c8a 100755
> --- a/tests/qemu-iotests/091
> +++ b/tests/qemu-iotests/091
> @@ -47,7 +47,7 @@ _supported_fmt qcow2
> _supported_proto file
> _supported_os Linux
> _default_cache_mode "none"
> -_supported_cache_modes "writethrough" "none" "writeback"
> +_supported_cache_modes "writethrough" "none" "writeback" "target"
> size=1G
> --
> 2.21.0
>
>
>