lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Unit tests hygiene


From: Vadim Zeitlin
Subject: Re: [lmi] Unit tests hygiene
Date: Fri, 3 Jun 2022 19:47:32 +0200

On Fri, 3 Jun 2022 16:29:18 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> On 5/21/22 18:34, Vadim Zeitlin wrote:
GC> > 
GC> >  Some time ago I proposed to run lmi unit tests under UBSAN, the undefined
GC> > behaviour sanitizer, and I've finally tried doing this. I used the
GC> > autotools build, as it was simple to do there because it's enough to just
GC> > append CXX='g++ -fsanitize=undefined' to configure command line and then
GC> > build as usual.
GC> 
GC> I believe something like this will do the same in 'workhorse.make'
GC> (and, notably, $(debug_flag) gets passed to the linker as well as
GC> to the compiler):
GC> 
GC> -debug_flag := -ggdb
GC> +# debug_flag := -ggdb -fsanitize=undefined -fno-sanitize-recover=all 
--param=max-vartrack-size=60000000
GC> +
GC> +debug_flag := -ggdb -fsanitize=undefined -fno-sanitize-recover=all 
-fno-var-tracking-assignments

 I don't think it's a good idea to always use UBSAN unconditionally. It
takes somewhat longer to compile, but also makes things much slower (sorry,
not sure by how much exactly, but it's times, not percents) during
run-time. Finally, it's incompatible with any other sanitizers you might
want to use. So I believe it should be only used in a separate build type.

GC> My first question is whether you ever got 'i7702.cpp' to finish
GC> compiling. It's gone for many minutes here already, even with
GC> the second (not '#'-commented) set of flags above.

 I didn't have any problem with this or any other file, but I didn't use
-fno-sanitize-recover=all because by the time I learnt about it, I had
already built everything without it and I didn't want to rebuild again. I
can retry building with it too, but not now/today, unfortunately.

GC> >  After doing this I did find (only!) 2 errors. First one occurs in
GC> > input_test:
GC> > 
GC> > input_test.cpp:165:5: runtime error: load of value 25000, which is not a 
valid value for type 'oenum_alb_or_anb'
GC> > input_test.cpp:167:5: runtime error: load of value 25000, which is not a 
valid value for type 'oenum_alb_or_anb const'
GC> > 
GC> > This happens for the part of the code for which we already had had to
GC> > disable a clang warning and the fact that UBSAN doesn't like it neither
GC> > makes me wonder if the comment saying that "C++ allows that" is actually
GC> > correct.
GC> 
GC> The specific code is
GC>   return static_cast<T>(bourn_cast<std::underlying_type_t<T>>(d));
GC> whose behavior was once unspecified, but is now undefined:
GC>   https://cplusplus.github.io/CWG/issues/1766.html
GC> 
GC> > In any case, this is not a real problem, of course, but perhaps it
GC> > would be better to remove this part of the test to avoid to deal with the
GC> > compilation and UB-testing issues arising due to it? Or is it
GC> > representative of what the actual code does and so you consider it
GC> > important to keep it? FWIW I couldn't find any way to avoid UBSAN warning
GC> > about doing this.
GC> 
GC> It is desirable for UBSAN to complain about this. It would be
GC> even more desirable for the compiler to issue a diagnostic,
GC> but of course it doesn't.

 Well, clang does, so if the same issue ever arises in the rest of the
code, we'd know about it because it would break the CI build.


GC> >  The second error is in irc7702a_test:
GC> > 
GC> > any_member.hpp:177:33: runtime error: load of value 2, which is not a 
valid value for type 'bool'
GC> 
GC> Can you easily tell on which line of 'irc7702a_test.cpp' that occurs?

 I should be able to do it when I have time to rebuild things later, but I
don't think it matters because the problem happens whenever mec_state is
created.

GC> > which happens because mec_state::C3_init_is_mc boolean field is
GC> > uninitialized when we "use" it value_cast() to which it is passed by
GC> > value when we call it from that line (which is in holder::assign()) of
GC> > any_member.hpp. So this isn't a real problem neither, of course, because 
we
GC> > never actually use this value and an optimizing compiler wouldn't even
GC> > bother reading the value before calling value_cast(), but it does look 
like
GC> > an uninitialized variable access to UBSAN. I see a couple of potentially
GC> > worthwhile ways to fix it:
GC> > 
GC> > - Either we can actually initialize all the fields of mec_state by adding
GC> >   "{}" to their declarations.
GC> 
GC> Apparently I was assuming that all code paths would initialize this
GC> field to a valid boolean value, and I'd like to know specifically
GC> why that's untrue.

 The field is initialized once mec_state ctor finishes executing. But the
"uninitialized" access happens while it's executing and, in fact, while
it's initializing the field. This explains why it's not really a problem:
the field will get initialized during the very next access, but reading it
initially in any_member.hpp uses the value that hasn't been initialized
just yet.

GC> I hesitate to default-initialize everything just to make the problem
GC> "go away". But ultimately we probably will want to default-initialize
GC> everything, after diagnosing the root cause and remediating it.

 Yes, I think default initializing is the right thing to do by default and
should only be avoided when there are compelling performance reasons not to
do it. In fact I really wish we could have "class foo { ... } = default" in
C++ to mean that all class members should be default initialized. But I
guess we'll have to wait for meta-classes for that.

GC> > - Or we could change "To" in value_cast() to "To const&" to ensure that 
the
GC> >   value never needs to be accessed.
GC> 
GC> Can you readily state the location of the offending value_cast()?

 Yes, it's the any_member.hpp line mentioned above, i.e. 177.

 Just to be totally clear: we _copy_ the value of the not yet initialized
mec_state::C3_init_is_mc while we're initializing it. Copying it involves
reading it which, formally, counts as accessing uninitialized memory. By
not copying it here, this access would be avoided.

 In fact, thinking more about this, I really believe that we should
implement not one or the other suggestion, but both of them. Default
initializing is good and passing objects by value is bad, so let's do the
former and not do the latter even if it doesn't really change anything
much because why not?

 Thanks for looking into this!
VZ

Attachment: pgpt3wj3Um2J2.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]