Re: Bugs in printf/sprintf formatted output
From: Maciej W. Rozycki
Subject: Re: Bugs in printf/sprintf formatted output
Date: Thu, 20 Jun 2024 15:43:10 +0100 (BST)
Hi Arnold,
I apologise for not replying right away. It's been a hectic time here. I did
get your original e-mail and plan to verify your fixes by the end of this
week. Thank you for your patience.
Best,
Maciej
On Thu, 20 Jun 2024, arnold@skeeve.com wrote:
> Hi.
>
> Did you see my mail from ~ a week ago about a branch in the
> repo with fixes? I'd like you to verify that the changes meet
> your needs before merging, but I also can't wait indefinitely.
> Please let me know if you will get to this soon.
>
> Thanks,
>
> Arnold
>
> "Maciej W. Rozycki" <macro@redhat.com> wrote:
>
> > Hi,
> >
> > Please let me know if you need this bug report sent differently.
> >
> > I have been following the guidelines from the top-level README file and
> > the gawk(1) manual page. I believe the bugs are generic, however for the
> > record they have been observed with gawk 4.2.1 on POWER9/Linux, gawk 5.1.0
> > on RISC-V/Linux and gawk 4.1.4 on x86-64/Linux systems (as distributed)
> > and then the upstream master and a choice of earlier checkouts of gawk
> > built with GCC 14.0.1 on POWER9/Linux (in particular while bisecting a
> > problematic commit; see a note below).
> >
> > I have recently been working on improving test coverage for formatted
> > output verification in glibc. Because of the humongous amount of data
> > processed, at least an order of magnitude larger than the whole glibc
> > repository, I chose to generate the reference test data on the fly
> > rather than take the usual approach of keeping it pregenerated in the
> > repository. I use gawk as an independent implementation, in particular
> > in the bignum mode (with one exception I am aware of, for floating-point
> > input; see a note below). The resulting glibc test improvements will be
> > submitted upstream soon.
> >
> > In the course of writing the test cases I have checked various released
> > versions of gawk as well as upstream master and have come across numerous
> > corner cases that gawk does not handle correctly (which for the record I
> > have worked around by explicit handling). Some apply to versions of up to
> > 4.2.1 only (see a note below), but I have listed them for completeness as
> > that might be useful in the assessment.
> >
> > The issues with reproducers are in particular:
> >
> > - extraneous leading 0 produced for the alternative form with the o
> > conversion, e.g. { printf "%#.2o", 1 } produces "001" rather than "01",
> >
> > - unexpected 0 produced where no characters are expected for the input of
> > 0 and the alternative form with the precision of 0 and the integer
> > hexadecimal conversions, e.g. { printf "%#.x", 0 } produces "0" rather
> > than "",
> >
> > - missing + character in the non-bignum mode only for the input of 0 with
> > the + flag, precision of 0 and the signed integer conversions, e.g.
> > { printf "%+.i", 0 } produces "" rather than "+",
> >
> > - missing space character in the non-bignum mode only for the input of 0
> > with the space flag, precision of 0 and the signed integer conversions,
> > e.g. { printf "% .i", 0 } produces "" rather than " ",
> >
> > - for released gawk versions of up to 4.2.1 missing - character for the
> > input of -NaN with the floating-point conversions, e.g. { printf "%e",
> > "-nan" } produces "nan" rather than "-nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the
> > input of -NaN with the floating-point conversions, e.g. { printf "%e",
> > "-nan" } produces "+nan" rather than "-nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the
> > input of Inf or NaN in the absence of the + or space flags with the
> > floating-point conversions, e.g. { printf "%e", "inf" } produces "+inf"
> > rather than "inf",
> >
> > - for released gawk versions of up to 4.2.1 missing + character for the
> > input of Inf or NaN with the + flag and the floating-point conversions,
> > e.g. { printf "%+e", "inf" } produces "inf" rather than "+inf",
> >
> > - for released gawk versions of up to 4.2.1 missing space character for
> > the input of Inf or NaN with the space flag and the floating-point
> > conversions, e.g. { printf "% e", "nan" } produces "nan" rather than
> > " nan",
> >
> > - for released gawk versions from 5.0.0 onwards + character output for the
> > input of Inf or NaN with the space flag and the floating-point
> > conversions, e.g. { printf "% e", "inf" } produces "+inf" rather than
> > " inf",
> >
> > - for released gawk versions from 5.0.0 onwards the field width is ignored
> > for the input of Inf or NaN and the floating-point conversions, e.g.
> > { printf "%20e", "-inf" } produces "-inf" rather than
> > "                -inf",
> >
> > NB for released gawk versions of up to 4.2.1 floating-point conversion
> > issues apply to the bignum mode only, as in the non-bignum mode system
> > sprintf(3) is used. As from version 5.0.0 specialized handling has been
> > added for [-]Inf and [-]NaN inputs with commit 8dba5f4c9002 ("Output +inf,
> > +nan etc. also, so that output can be input. Doc, tests, fixed.") and the
> > issues listed apply to both modes. All the unmarked issues as well as
> > ones marked as present from 5.0.0 onwards are also there in the upstream
> > master.
> >
> > The `--posix' flag makes gawk versions from 5.0.0 onwards avoid the issue
> > with field width and the + character unconditionally output for the input
> > of Inf or NaN, however not the remaining issues. I realise there are some
> > limitations in Inf/NaN handling coming from gawk's legacy, so for example
> > the space flag may or may not be reasonably supported in the non-POSIX
> > mode, however I think the field width ought to be always respected, as it
> > will often be used to format tables, etc. and it's a regression from 4.2.1
> > too.
> >
> > FAOD I have used
> > <https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html> as
> > the normative reference.
> >
> > Please let me know if you have any questions or comments or need any further
> > information. I'll be happy to verify any potential fixes before you have
> > pushed them to the upstream master.
> >
> > Maciej
> >
>
>
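
For readers who want to cross-check the expected strings in the report
quoted above, the sketch below feeds the same conversions to the C
library's snprintf(3), which the report notes gawk up to 4.2.1 uses for
floating-point conversions in the non-bignum mode. It assumes that C
semantics are an acceptable stand-in for the awk output the report expects
and that the library spells infinity as "inf" (C also permits "infinity").
CHECK, matches and count_mismatches are hypothetical names, not part of
gawk or glibc. The NaN cases are left out because how the sign of a NaN is
shown may vary between C libraries.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Print one result and report whether it matches the expected string. */
static int matches(const char *label, const char *got, const char *expected)
{
    int ok = strcmp(got, expected) == 0;
    printf("%-6s -> \"%s\" (%s)\n", label, got, ok ? "ok" : "MISMATCH");
    return ok;
}

/* Format one value with a literal format string and count a mismatch.
   The format stays a literal so the compiler can still check it. */
#define CHECK(expected, fmt, arg)                          \
    do {                                                   \
        int n = snprintf(buf, sizeof buf, fmt, (arg));     \
        assert(n >= 0 && (size_t)n < sizeof buf);          \
        bad += !matches(fmt, buf, expected);               \
    } while (0)

/* Hypothetical helper: returns how many of the expected strings from the
   report the local C library fails to reproduce. */
int count_mismatches(void)
{
    char buf[64];
    int bad = 0;

    /* Integer conversions; these results are fixed by the C standard. */
    CHECK("01", "%#.2o", 1u);   /* precision already supplies the leading 0 */
    CHECK("",   "%#.x",  0u);   /* zero value, zero precision: no characters */
    CHECK("+",  "%+.i",  0);    /* the sign is still produced */
    CHECK(" ",  "% .i",  0);    /* the space is still produced */

    /* Infinity; the "inf" spelling is an assumption about the library. */
    CHECK("inf",  "%e",  (double)INFINITY);
    CHECK("+inf", "%+e", (double)INFINITY);
    CHECK(" inf", "% e", (double)INFINITY);
    /* Field width 20: sixteen spaces of padding, then "-inf". */
    CHECK("                -inf", "%20e", -(double)INFINITY);

    return bad;
}
```

Calling count_mismatches() from main() prints one line per reproducer; a
return value of 0 means the local library agrees with every expected
string listed.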