lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] PATCH: tests build fixes for clang


From: Vadim Zeitlin
Subject: Re: [lmi] PATCH: tests build fixes for clang
Date: Mon, 8 Mar 2021 19:19:19 +0100

On Mon, 8 Mar 2021 16:20:30 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> On 3/7/21 1:10 PM, Vadim Zeitlin wrote:
GC> > 
GC> >  I'd like to submit a series of small fixes to the tests allowing to build
GC> > them with clang, please see 
https://github.com/let-me-illustrate/lmi/pull/173
GC> 
GC> 52e90cdc6 Use std::next() rather than "1+" with raw literal strings
GC> 
GC> I'd like to find a different way to avoid the clang warning here.

 Unfortunately I couldn't find anything better. Unless, of course, you
consider just disabling the warning to be better, but the problem is that
I'd like to use "1 + R"aw(" construct in more places and disabling the
warning every time before/after the raw string is, of course, not ideal, so
it would have to be disabled globally even though it looks like a
potentially useful warning with potential to catch some typos.

GC> AFAICT, clang is looking for cases like
GC>   int error_code = 37;
GC>   std::string error_msg = "Error code is " + error_code;
GC> which might have hoped to produce
GC>   "Error code is 37"
GC> but actually initializes a string with
GC>   char const* p = "Error code is ";   // 'Error code is \0'
GC>   char const* p_plus_offset = p + 37; // points to ʻOumuamua

 Yes, exactly.

GC> But this case is
GC>   char const* error_msg = "\n\n\nError code is 37";
GC>   int offset = 3; // Skip three leading newlines
GC>   std::string(offset + error_msg); // 'Error code is 37'
GC> so it's a false positive. This warning is overzealous.

 Yes, but the compiler can't know it. At least I don't see any really
useful heuristic for distinguishing the 2 cases.

GC> Working around it with std::next() seems unnatural to me.

 I was actually glad to stumble upon the std::next() solution because I
thought it was more clear than "1+" hack, as we really want to use the
pointer to the next string element. I.e. I won't pretend that I find it
especially beautiful, but I did hope that you'd prefer it to the old "1+"
approach.

GC> The rationale for std::next() is that not all iterators are pointers.
GC> But there are no iterators here, only pointers.

 But the fact that not all iterators are pointers doesn't invalidate the
fact that this concrete pointer is an iterator. I.e. I don't see any
contradiction here, we use std::next() with an iterator which happens to be
a pointer.

GC> Here we have a string literal that's written in such a way as to
GC> require dropping its first character. The problem isn't choosing
GC> one or another means of dropping the first character--instead,
GC> the problem, IMO, is needing to drop it.
GC> 
GC> The root cause is using raw string literals, whose syntax is just
GC> frankly horrible.

 Yes, it's regrettable that it's so limited. But there is nothing we can do
about it.

GC> Here, the benefit (not having to write "\n" for newline)

 I think you're drastically underestimating the benefit. Even though not
having to write "\n"s is very much appreciable on its own, this is not just
about the newlines, but also about the quotes and backslashes that can be
difficult to properly quote/unquote in arbitrarily long strings.

 Practically speaking, raw strings are great because they allow to just
copy-paste between the editor and the terminal. So I'd very much like to
continue using them...

GC> is outweighed by the damage (having to drop the first
GC> newline) and the maintenance cost (having to drop the first
GC> newline in a different way, because some compiler spuriously
GC> warned against the original way).

 ... and would strongly prefer to live without this clang warning (i.e.
disable it globally at makefile level) rather than not to use raw string
literals. They absolutely have their problems, but they're still much
better than normal strings for multiline chunks of text. IMO f5d41a09e
(Replace raw string literals, 2021-03-08) makes the source much less
readable and maintainable and I wish it could be reverted.

GC> Raw string literals really do simplify writing regexes, but in
GC> general they're not worth the trouble, and should be avoided,
GC> In My Opinion. It took me several tries to replace them
GC> correctly with classic "C" equivalents in commit f5d41a09ecf2.

 This is a problem of the classic C string literals and not raw strings,
however. I'd argue that the original version of rate_table_test.cpp was
perfectly clear, with the possible exception of 1+/std::next hack, i.e. the
string contents was clearly visible (and could be easily copied to be
pasted elsewhere). The latest version contains a lot of extra visual noise
(quotes, backslashes, "n"s) which makes it much more difficult to parse
(and, of course, it can't be copied anywhere).

 Please reconsider proscribing raw strings!
VZ

Attachment: pgpN9sV7yYl2l.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]