qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 20/26] qapi: Fix output visitor to return qnu


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v6 20/26] qapi: Fix output visitor to return qnull() instead of NULL
Date: Sat, 12 Sep 2015 16:11:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 09/12/2015 02:10 AM, Markus Armbruster wrote:
>
>> Relatively harmless, because the qnull_ singleton is static.  Worth
>> fixing anyway, of course.
>> 
>>>> I'm still investigating, and may be able to find the patch
>>>
>>> Squash this in, and you can have:
>>> Reviewed-by: Eric Blake <address@hidden>
>
> Well, your further questions are spot on; my squash proposal isn't quite
> right.
>
>
>> I put the qnull() in qmp_output_first() to avoid (QObject *)0 entirely.
>> Also because that's where I found the FIXME :)
>> 
>> You lift it into one of two callers.  Impact:
>> 
>> * qmp_output_get_qobject()
>> 
>>   - master: return NULL when !e || !e->value
>> 
>>   - my patch: return qnull() when !e
>>               return NULL when !e->value
>> 
>>   - your patch: return qnull() when !e || !e->value
>
> The leak in your patch was that qnull() increments the count, then
> qmp_output_get_object() also increments the count.
>
> I guess I'll have to investigate where e->value can be set to NULL to
> see if my idea was safe, or if we'd have to do something even different.
>
> If this were the only caller, then I guess we could always do something
> like this in qmp_output_first(), leaving the possibility of returning
> NULL for e->value:
>
> if (!e) {
>     obj = qnull();
>     qobject_decref(obj); /* Caller will again increment refcount */
>     return obj;
> }
>
> But it's not the only caller.
>
>> 
>> * qmp_output_visitor_cleanup()
>> 
>>   - master: root = NULL when when !e || !e->value
>> 
>>   - my patch: root = qnull() when !e
>>               root = NULL when !e->value
>> 
>>   - your patch: root = NULL when when !e || !e->value
>
> And calls qobject_decref(root), but that is safe on NULL.
>
> Here, your patch ends up with a net 0 refcnt on qnull() (incremented in
> qmp_output_first(), decremented in the cleanup), but my idea above to
> pre-decrement would be wrong.
>
> Another option would be to keep your patch to qmp_output_first(), but
> then fix qmp_output_get_object() to special case it it has an instance
> of QNull (no refcnt change) vs. anything else (qobject_incref).  But
> that feels gross.

It does.

>> where e = QTAILQ_LAST(&qov->stack, QStack)
>> 
>> Questions:
>> 
>> * How can e become NULL?
>> 
>>   The only place that pushes onto &qov->stack appears to be
>>   qmp_output_push_obj().  Can obviously push e with !e->value, but can't
>>   push null e.
>
> My understanding is that qov->stack corresponds to nesting levels of {}
> or [] in the JSON code.  The testsuite shows a case where the stack is
> empty as one case where e is NULL, but if e is non-NULL, I'm not sure
> whether e->value can ever be NULL.  I'll have to read the code more closely.
>
>> 
>> * What about qmp_output_last()?
>> 
>>   Why does qmp_output_first() check for !e, but not qmp_output_last()?
>> 
>>   My patch makes them less symmetric (bad smell).  Yours makes them more
>>   symmetric, but not quite.
>
> Which is awkward in its own right.
>
>> 
>> * How does this contraption work?
>
> Indeed. Without reading further, we're both shooting in the dark for
> something that makes tests pass, but without being a clean interface.
>
> How about this: go ahead with your series as proposed, with the squash
> hunk to tests/ to avoid the leak in the testsuite, but leaving the leak
> in qmp_output_get_object(), and we address the leak in a followup patch.

I'll add a FIXME explaining the reference counting bug, and the wider
problem.

What exactly do you want changed in tests?

>  refcnt is size_t, so on 32-bit platforms, it could roll over after 4G
> repeats of the leak and cause catastrophe,

Assertion failure in qnull_destroy_obj(), to be exact.

>                                            but I don't think we are
> outputting qnull() frequently enough for the leak to bite while we wait
> for a followup patch.

Agree.

> The followup patch(es) will then have to include some contract
> documentation, so that we track what we learned while investigating the
> code, and so that the next reader has more than just code to start from.

It's time to retrofit visitors with a proper contract.

Retrofitting a contract is much harder than starting with one, but we
got to play the hand we've been dealt.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]