qemu-block

Re: fdatasync semantics and block device backup


From: Kevin Wolf
Subject: Re: fdatasync semantics and block device backup
Date: Tue, 28 Apr 2020 13:11:24 +0200

Hi Bryan,

First of all, for your next question, please don't reply to a message in
an unrelated thread, but start a new email. This will give you a lot
more visibility because people generally use a threaded email view and
will decide whether to read an email or not depending on whether the
topic of that thread is interesting to them.

On 27.04.2020 at 21:49, Bryan S Rosenburg wrote:
> Blockdev community,
> 
> Our group would like to write block device backups directly to an object 
> store, using an interface such as s3fs or rclone-mount. We've run into 
> problems with both interfaces, and in both cases the problems revolve 
> around fdatasync system calls. With s3fs, fdatasync calls are painfully 
> slow. With rclone-mount, the calls are very fast but don't do anything.
> 
> Syncing files to an object store is inherently problematic, as a proper 
> sync requires finalizing the object that holds the file. After 
> finalization, additional writes to the file require a new object to be 
> created and the old object to be copied and destroyed. This process 
> results in an N-squared performance problem for files that are synced 
> periodically as they are written, as is the case for qemu backups.
> 
> Empirically, s3fs implements fdatasync, and hence backups written to s3fs 
> take an untenably long time. I can provide data and straces, if needed.
> 
> Backups written to rclone-mount are much faster, but there are obvious 
> semantic problems. The backup job completes successfully before the file 
> is actually stable in the object store. And in fact, a lot of the work of 
> finalizing the file occurs during the "close" system call that is invoked 
> as part of the qmp_blockdev_del operation. The syscall causes that 
> operation to take so long that other commands time out waiting to "acquire 
> state change lock (held by monitor qemuProcessEventHandler)".
> 
> My questions for the group are: Has anyone else tried writing backups to 
> file systems that don't have good support for fdatasync, and do you have 
> any advice other than "Don't do that"?

I think "don't do that" is a good answer actually.

You may want to put an NBD indirection between QEMU and your object
store, so that the close() syscall will just block a qemu-nbd process
that has already closed its connection to QEMU instead of blocking all
of QEMU.

It is possible to disable fdatasync() by specifying cache=unsafe for
the block device, so you could avoid the penalty of repeated syncs on
s3fs.
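
To get a feeling for how large that penalty is, here is a minimal Python
sketch of the cost model you describe above. It is an assumption, not a
measurement: it supposes that every sync finalizes the object and that
the next write has to copy everything written so far into a new object.
The function name bytes_copied is made up for this illustration.

```python
def bytes_copied(total_bytes: int, sync_interval: int) -> int:
    """Estimate the bytes re-copied when each sync finalizes the object.

    Assumed model: after a sync, continuing to write forces the object
    store to copy all previously written data into a new object.
    """
    copied = 0
    written = 0
    while written < total_bytes:
        # Resuming after a finalizing sync copies the old object first.
        copied += written
        written = min(written + sync_interval, total_bytes)
    return copied


GiB = 1024 ** 3
# Syncing every 1 GiB vs. syncing only once at the very end.
periodic = bytes_copied(100 * GiB, 1 * GiB)
single = bytes_copied(100 * GiB, 100 * GiB)
print(periodic // GiB, single // GiB)
```

Under this model a 100 GiB image synced every 1 GiB copies 4950 GiB in
total, while a single sync at the end copies nothing extra. Doubling
the image size roughly quadruples the copied amount.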

Of course, if s3fs requires an fsync before data is actually stable, you
couldn't consider your backup complete when the backup block job
finishes successfully. Instead, you would have to issue an fsync
manually and wait for its result before you can consider the backup
successful.
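
A manual fsync could look roughly like the following Python sketch. The
helper name fsync_path and the idea of calling it on the backup file
after the job has finished are assumptions for illustration. Whether an
fsync on a freshly opened descriptor is sufficient depends on the
filesystem, so please verify this for s3fs.

```python
import os


def fsync_path(path: str) -> None:
    """Open path, fsync it, and raise OSError if the sync fails."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Blocks until the filesystem reports the data as stable;
        # an error is raised as OSError and must not be ignored.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Only if the call returns without raising should the backup be treated
as successful.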

Kevin



