bug-gawk

Re: Bugs in printf/sprintf formatted output


From: Maciej W. Rozycki
Subject: Re: Bugs in printf/sprintf formatted output
Date: Thu, 20 Jun 2024 15:43:10 +0100 (BST)

Hi Arnold,

 I apologise for not replying right away.  It's been a hectic time here.  
I did get your original e-mail and plan to verify your fixes by the end of 
this week.  Thank you for your patience.

 Best,

  Maciej

On Thu, 20 Jun 2024, arnold@skeeve.com wrote:

> Hi.
> 
> Did you see my mail from ~ a week ago about a branch in the
> repo with fixes? I'd like you to verify that the changes meet
> your needs before merging, but I also can't wait indefinitely.
> Please let me know if you will get to this soon.
> 
> Thanks,
> 
> Arnold
> 
> "Maciej W. Rozycki" <macro@redhat.com> wrote:
> 
> > Hi,
> >
> >  Please let me know if you need this bug report sent differently.
> >
> >  I have been following the guidelines from the top-level README file and 
> > the gawk(1) manual page.  I believe the bugs are generic, however for the 
> > record they have been observed with gawk 4.2.1 on POWER9/Linux, gawk 5.1.0 
> > on RISC-V/Linux and gawk 4.1.4 on x86-64/Linux systems (as distributed) 
> > and then the upstream master and a choice of earlier checkouts of gawk 
> > built with GCC 14.0.1 on POWER9/Linux (in particular while bisecting a 
> > problematic commit; see a note below).
> >
> >  I have recently been working on improving test coverage for formatted 
> > output verification in glibc.  Because of the humongous amount of data 
> > processed, at least an order of magnitude larger than the whole glibc 
> > repository, I chose to generate the reference test data on the fly 
> > rather than keep it pregenerated in the repository as usual, using gawk 
> > as an independent implementation, in particular in the bignum mode (with 
> > an exception I am aware of for floating-point input; see a note below).  
> > The resulting glibc test improvements will be submitted upstream soon.
> >
> >  In the course of writing the test cases I have checked various released 
> > versions of gawk as well as upstream master and have come across numerous 
> > corner cases that gawk does not handle correctly (which for the record I 
> > have worked around by explicit handling).  Some apply to versions of up to 
> > 4.2.1 only (see a note below), but I have listed them for completeness as 
> > that might be useful in the assessment.
> >
> >  The issues with reproducers are in particular:
> >
> > - extraneous leading 0 produced for the alternative form with the o 
> >   conversion, e.g. { printf "%#.2o", 1 } produces "001" rather than "01",
> >
> > - unexpected 0 produced where no characters are expected for the input of 
> >   0 and the alternative form with the precision of 0 and the integer 
> >   hexadecimal conversions, e.g. { printf "%#.x", 0 } produces "0" rather 
> >   than "",
> >
> > - missing + character in the non-bignum mode only for the input of 0 with 
> >   the + flag, precision of 0 and the signed integer conversions, e.g.
> >   { printf "%+.i", 0 } produces "" rather than "+",
> >
> > - missing space character in the non-bignum mode only for the input of 0 
> >   with the space flag, precision of 0 and the signed integer conversions, 
> >   e.g. { printf "% .i", 0 } produces "" rather than " ",
> >
> > - for released gawk versions of up to 4.2.1 missing - character for the 
> >   input of -NaN with the floating-point conversions, e.g. { printf "%e", 
> >   "-nan" } produces "nan" rather than "-nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the 
> >   input of -NaN with the floating-point conversions, e.g. { printf "%e", 
> >   "-nan" } produces "+nan" rather than "-nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the 
> >   input of Inf or NaN in the absence of the + or space flags with the 
> >   floating-point conversions, e.g. { printf "%e", "inf" } produces "+inf" 
> >   rather than "inf",
> >
> > - for released gawk versions of up to 4.2.1 missing + character for the
> >   input of Inf or NaN with the + flag and the floating-point conversions, 
> >   e.g. { printf "%+e", "inf" } produces "inf" rather than "+inf",
> >
> > - for released gawk versions of up to 4.2.1 missing space character for
> >   the input of Inf or NaN with the space flag and the floating-point 
> >   conversions, e.g. { printf "% e", "nan" } produces "nan" rather than 
> >   " nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the 
> >   input of Inf or NaN with the space flag and the floating-point 
> >   conversions, e.g. { printf "% e", "inf" } produces "+inf" rather than 
> >   " inf",
> >
> > - for released gawk versions from 5.0.0 onwards the field width is ignored 
> >   for the input of Inf or NaN and the floating-point conversions, e.g.
> >   { printf "%20e", "-inf" } produces "-inf" rather than
> >   "                -inf".
> >
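
For reference, the outputs the list above expects can be cross-checked against the C library's own formatted output. The following is only a minimal sketch, not part of the report or of the glibc tests: the helper names are hypothetical, and it assumes a C99 library whose snprintf(3) spells infinity as "inf". The NaN cases are left out because the C standard lets the library choose how NaN is spelled.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers (not from gawk or glibc): each formats one value
   with the given conversion specification and reports whether the result
   equals the expected string. */
static int unsigned_formats_as(const char *fmt, unsigned int value,
                               const char *expected)
{
    char buf[64];
    snprintf(buf, sizeof buf, fmt, value);
    return strcmp(buf, expected) == 0;
}

static int signed_formats_as(const char *fmt, int value,
                             const char *expected)
{
    char buf[64];
    snprintf(buf, sizeof buf, fmt, value);
    return strcmp(buf, expected) == 0;
}

static int double_formats_as(const char *fmt, double value,
                             const char *expected)
{
    char buf[64];
    snprintf(buf, sizeof buf, fmt, value);
    return strcmp(buf, expected) == 0;
}

/* Runs the reproducers whose expected output does not depend on how the
   C library spells NaN. */
static void check_expected_outputs(void)
{
    /* Alternative form with o: the precision already supplies the
       leading 0, so no extra 0 is added. */
    assert(unsigned_formats_as("%#.2o", 1u, "01"));
    /* Zero value with zero precision: no characters at all. */
    assert(unsigned_formats_as("%#.x", 0u, ""));
    /* The sign or space flag still applies when no digits are output. */
    assert(signed_formats_as("%+.i", 0, "+"));
    assert(signed_formats_as("% .i", 0, " "));
    /* Infinity takes a sign character only when a flag asks for one. */
    assert(double_formats_as("%e", INFINITY, "inf"));
    assert(double_formats_as("%+e", INFINITY, "+inf"));
    assert(double_formats_as("% e", INFINITY, " inf"));
}
```

Calling check_expected_outputs() from a small main() and running the program should complete silently if the C library agrees with the expected strings quoted in the list.
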
> >  NB for released gawk versions of up to 4.2.1 floating-point conversion 
> > issues apply to the bignum mode only, as in the non-bignum mode system 
> > sprintf(3) is used.  As from version 5.0.0 specialized handling has been 
> > added for [-]Inf and [-]NaN inputs with commit 8dba5f4c9002 ("Output +inf, 
> > +nan etc. also, so that output can be input. Doc, tests, fixed.") and the 
> > issues listed apply to both modes.  All the unmarked issues as well as 
> > ones marked as present from 5.0.0 onwards are also there in the upstream 
> > master.
> >
> >  The `--posix' flag makes gawk versions from 5.0.0 onwards avoid the issue 
> > with field width and the + character unconditionally output for the input 
> > of Inf or NaN, however not the remaining issues.  I realise there are some 
> > limitations in Inf/NaN handling coming from gawk's legacy, so for example 
> > the space flag may or may not be reasonably supported in the non-POSIX 
> > mode, however I think the field width ought to be always respected, as it 
> > will often be used to format tables, etc. and it's a regression from 4.2.1 
> > too.
> >
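
The point about tables can be illustrated with the C library, where the field width pads infinities just like finite values. This is only a sketch under the same assumptions as any C99 snprintf(3); format_cell and check_cell_widths are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats one table cell with "%20e" and returns
   the number of characters produced (0 on error). */
static size_t format_cell(char *buf, size_t size, double value)
{
    int n = snprintf(buf, size, "%20e", value);
    return n < 0 ? 0 : (size_t)n;
}

/* Every cell is expected to be exactly 20 characters wide, so that
   columns line up whether the value is finite or infinite. */
static void check_cell_widths(void)
{
    char buf[64];

    assert(format_cell(buf, sizeof buf, 1.5) == 20);
    assert(format_cell(buf, sizeof buf, -INFINITY) == 20);
    /* Right-justified: 16 padding spaces followed by "-inf". */
    assert(buf[0] == ' ');
    assert(strcmp(buf + 16, "-inf") == 0);
}
```
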
> >  FAOD I have used 
> > <https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html> as 
> > the normative reference.
> >
> >  Please let me know if you have any questions or comments or need any further 
> > information.  I'll be happy to verify any potential fixes before you have 
> > pushed them to the upstream master.
> >
> >   Maciej
> >
> 
> 



