From: Juan Quintela
Subject: [Qemu-devel] Re: [PATCH 02/10] Add buffered_file_internal constant
Date: Tue, 30 Nov 2010 12:56:15 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Anthony Liguori <address@hidden> wrote:
> On 11/24/2010 04:52 AM, Juan Quintela wrote:
>> "Michael S. Tsirkin"<address@hidden>  wrote:
>>    
>>> On Wed, Nov 24, 2010 at 12:02:59AM +0100, Juan Quintela wrote:
>>>      
>>>> From: Juan Quintela<address@hidden>
>>>>        
>>    
>>>> diff --git a/buffered_file.h b/buffered_file.h
>>>> index 98d358b..a728316 100644
>>>> --- a/buffered_file.h
>>>> +++ b/buffered_file.h
>>>> @@ -21,6 +21,8 @@ typedef void (BufferedPutReadyFunc)(void *opaque);
>>>>   typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
>>>>   typedef int (BufferedCloseFunc)(void *opaque);
>>>>
>>>> +extern const int buffered_file_interval;
>>>> +
>>>>        
>>> This shouldn't be exported, IMO.
>>>      
>> What do you want?  an accessor function?  Notice that it is a constant.
>> We need the value in other places, see the last patch.
>>    
>
> That one's just as wrong as this one.  TBH, this whole series is
> fundamentally the wrong approach because it's all ad hoc heuristics
> benchmarked against one workload.

No, I benchmarked against two workloads:
a- idle guest (because it was faster to test)
b- busy guest (each test takes forever, which is why I tested it
last).

So, I don't agree with that.

> There are three fundamental problems: 1) kvm.ko dirty bit tracking
> doesn't scale

Fully agree, but this patch series doesn't touch that.

> 2) we lose flow control information because of the
> multiple levels of buffering which means we move more data than we
> should move

Fully agree here, but fixing it correctly is a "massive change".

> 3) migration prevents a guest from executing the device
> model because of qemu_mutex.

This is a different problem.

> Those are the problems to fix.

Fixing those still doesn't fix the stalls in the main loop.

So, you are telling me that there is this list of problems that needs
to be fixed, that fixing them is not enough to fix the stalls, and
that they imply massive changes.

In the meantime, everybody on stable and 0.14 is unable to use
migration with guests bigger than 2GB/4GB.

>  Sprinkling the code with returns in
> semi-random places because it benchmarked well for one particular test
> case is something we'll deeply regret down the road.

This was mean :(

There are two returns and one heuristic (roughly sketched in the code
after this list):

- return a) we try to migrate when we know that there is no space, an
  obvious optimization/bug fix (depends on how you look at it).

- return b) we don't need to handle the TLB bitmap code for kvm.  I
  fully agree that we need to split the bitmaps into something more
  sensible, but that change is quite invasive, and the simple fix
  works for now.

- heuristic: if you really think that an io_handler should be able to
  stall the main loop for almost 4 seconds, sorry, I don't agree.
  Again, we can fix this much better, but it needs lots of changes:
   * I asked you if there is a "maximum" time that one io_handler can
     hog the main loop.  Answer: dunno/depends.
   * Long term: migration should be its own thread.  Have I said
     thread?  This is qemu, no threads.
   * qemu_mutex: this is the holy grail; if we drop qemu_mutex in
     ram_save_live(), we let the guest CPUs continue running.  But,
     and this is a big but, we still stall the main loop for 4
     seconds, so this solution is at best partial.
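
To make this concrete, here is a rough standalone sketch of what
return a) and the heuristic amount to.  Every name in it is a made-up
stand-in for illustration, not the actual patch or a real QEMU API:

#include <stdbool.h>
#include <stdint.h>

/* Timer period in ms; the series keeps a constant of this kind. */
#define BUFFERED_FILE_INTERVAL_MS 50

/* Stubs standing in for QEMU internals (hypothetical). */
extern bool rate_limit_exceeded(void);
extern bool page_is_dirty(int64_t page);
extern void send_page(int64_t page);
extern int64_t now_ms(void);

/*
 * Scan dirty pages, but never hog the main loop for more than one
 * timer period: bail out early and resume from the returned page on
 * the next iteration.
 */
int64_t save_some_pages(int64_t first, int64_t last)
{
    int64_t start = now_ms();

    if (rate_limit_exceeded()) {
        return first;   /* return a): no space in the buffer, don't scan */
    }

    for (int64_t page = first; page < last; page++) {
        if (page_is_dirty(page)) {
            send_page(page);
        }
        if (now_ms() - start > BUFFERED_FILE_INTERVAL_MS) {
            return page + 1;   /* heuristic: yield back to the main loop */
        }
    }
    return last;
}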

And that is it.  When I try to improve things, I receive complaints
left and right that exporting buffered_file_period/timeout is BAD,
very BAD, because it exposes buffered_file.c internal state.
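
For reference, the accessor variant asked about earlier in the thread
would be trivial.  A minimal sketch, assuming a getter name that is
not in the patch and the 50ms value mentioned below:

/* buffered_file.c: keep the constant private to this file ... */
static const int buffered_file_interval = 50;   /* ms */

/* ... and export only a getter (hypothetical name). */
int buffered_file_get_interval(void)
{
    return buffered_file_interval;
}

It exposes exactly the same information as the extern, just behind a
function call.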

But I am given the suggestion that I should instead create a
buffered_file_write_nop() that just increases the number of bytes
transferred for normal pages.  I agree that this "could" also work,
but it is actually worse in the sense that it exposes even more
internals of buffered_file.c.  In the first case, we only expose the
period of a timer; in the second, we hijack how
qemu_file_rate_limit() works, plus how we account for written memory,
plus how it calculates the amount of data it has sent, and with this
we "knowingly" lie about how much we have written.

How this can be "cleaner" than a timeout of 50ms is beyond me.

On the other hand, this is the typical case where you need a lot of
testing for each change you make.  I thought several times that I had
found the bit "guilty" of the stalls, and no luck, there was yet
another problem.

At several points I also thought, OK, now I can drop the previous
attempts; and no, that didn't work either.  This was the minimal set
of patches able to fix the stalls.

This is just very depressing.

Later, Juan.


