qemu-devel

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Block layer complexity: what to do to keep it under control?
Date: Wed, 29 Nov 2017 12:16:11 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

On Wed, Nov 29, 2017 at 01:30:06AM -0500, Jeff Cody wrote:
> On Wed, Nov 29, 2017 at 11:55:02AM +0800, Fam Zheng wrote:
> > Hi all,
> > 
> > As we move forwards with new features in the block layer, the chances of
> > tricky bugs happening have been increasing alongside - block jobs,
> > coroutines, throttling, AioContext, op blockers and image locking combined
> > together make a large and complex picture that is hard to fully understand
> > and work with. Some bugs we've encountered are quite challenging already.
> > Examples are:
> > 
> > - segfault in parallel blockjobs (iotest 30)
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01144.html
> > 
> > - Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage
> >   migration)
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01626.html
> > 
> > - Drainage in bdrv_replace_child_noperm()
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg00868.html
> > 
> > - Regression from 2.8: stuck in bdrv_drain()
> >   https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02193.html
> > 
> 
> I agree, it seems the complexity is growing by quite a bit.
> 
> > So in principle, what should we do to make the block layer easy to
> > understand, develop with and debug? I think we have opportunities in these
> > aspects:
> > 
> > - Documentation
> > 
> >   There is no central developer doc about the block layer, especially about
> >   how all the pieces fit together. Having one would make it a lot easier
> >   for new contributors to understand the code. Of course, we're facing the
> >   old problem: the code keeps moving, and maintaining an up-to-date
> >   document takes effort.
> > 
> >   Idea: add ./docs/devel/block.txt?
> > 
> 
> There are some bits of brilliance in what is already there; for instance,
> devel/atomics.txt is very thorough.  But I agree that a major piece missing
> is an overall design document, that provides the "why" to the "what".

While the atomics documentation is good, atomics themselves have been a
source of difficult bugs.

They should be used as little as possible and only where they can be
encapsulated in a composable abstraction (i.e. don't expect users of
your abstraction to understand atomics).

Why?  They are damn hard to use.  None of us is capable of using them
without introducing difficult bugs.
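
To make "encapsulated in a composable abstraction" concrete, here is a
minimal sketch. It uses plain C11 <stdatomic.h> rather than QEMU's own
atomic helpers, and the RefCount type and refcount_* names are hypothetical,
invented for illustration only. The point is that the memory-ordering
reasoning lives in one small file and callers only see get/put:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type, not a QEMU API: callers never touch the atomic. */
typedef struct {
    atomic_uint count;
} RefCount;

/* The creator of the object holds the first reference. */
static inline void refcount_init(RefCount *r)
{
    atomic_init(&r->count, 1);
}

static inline void refcount_get(RefCount *r)
{
    /* Relaxed is enough: the caller already holds a reference. */
    atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Returns true if the caller dropped the last reference and must free. */
static inline bool refcount_put(RefCount *r)
{
    /* Release: this thread's writes to the object happen before the drop. */
    if (atomic_fetch_sub_explicit(&r->count, 1, memory_order_release) == 1) {
        /* Acquire: see every other thread's writes before freeing. */
        atomic_thread_fence(memory_order_acquire);
        return true;
    }
    return false;
}
```

A user of this type only needs to know "call put, free when it returns
true". If the ordering turns out to be wrong, there is exactly one place to
fix it.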

There is also a temptation to rely on implicit effects of other code
(e.g. when you know there is a barrier in another function) for best
performance.  That's a bad property for code to have because it becomes
hard to change safely.
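
As a rough sketch of the alternative, again in C11 <stdatomic.h> and with a
hypothetical Mailbox type and mailbox_* names that do not exist in QEMU: the
ordering requirement is stated right where the data is published, instead of
depending on a barrier that happens to sit inside some other function.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-slot mailbox, for illustration only. */
typedef struct {
    int payload;
    atomic_bool ready;
} Mailbox;

static void mailbox_init(Mailbox *m)
{
    m->payload = 0;
    atomic_init(&m->ready, false);
}

static void mailbox_publish(Mailbox *m, int value)
{
    m->payload = value;
    /*
     * The release store is the barrier. It does not rely on any callee
     * containing one, so unrelated code can be refactored freely.
     */
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

static bool mailbox_try_consume(Mailbox *m, int *out)
{
    /* Acquire pairs with the release store in mailbox_publish(). */
    if (!atomic_load_explicit(&m->ready, memory_order_acquire)) {
        return false;
    }
    *out = m->payload;
    return true;
}
```

The fragile variant would write the payload, call a helper that is known to
contain a barrier today, and then set the flag with a plain store. That may
be marginally faster, but it breaks silently the day someone removes the
barrier from the helper.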
