Re: test results differents between the perl and XS parsers
From: Gavin Smith
Subject: Re: test results differents between the perl and XS parsers
Date: Tue, 17 Nov 2020 07:27:45 +0000
User-agent: Mutt/1.9.4 (2018-02-28)
On Tue, Nov 17, 2020 at 12:18:31AM +0100, Patrice Dumas wrote:
> On Mon, Nov 16, 2020 at 04:55:42PM +0000, Gavin Smith wrote:
> >
> > The simplest solution to this issue seems to be a "normalisation" phase
> > (for the tests only) where the line number information is duplicated.
>
> I did that, and also for other cases where the two parsers differed,
> including one case where the XS parser duplicated less its results.
>
> Now I am down to
>
> --- t/results/indices/encoding_index_latin1.pl 2020-11-16 23:51:39.412993646 +0100
> +++ t/results/indices/encoding_index_latin1.pl.new 2020-11-17 00:00:54.879434507 +0100
> @@ -158,7 +158,7 @@
> 'contents' => [
> {
> 'parent' => {},
> - 'text' => "\x{e9} \x{e9}"
> + 'text' => 'é é'
> }
> ],
> 'extra' => {
>
> and same for encoding_index_latin1_enable_encoding,
> encoding_index_utf8 and other similar tests.
>
> It seems like it is the only case of accented commands in parsed text.
> Any idea on what's going on?
Thanks for doing this; I had never expected that this would be possible or
practical.

The two strings appear to be the same string: "\x{e9}" is just the escaped
form of é (U+00E9). The question is, why are they output differently? I don't
know, and I will look into it when I have time. Things to look at include
whether the string is stored internally as UTF-8 or Latin-1, and the locale
settings in effect when the string is output.
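
A minimal sketch of how the first of those possibilities could be checked. It is only an illustration and not taken from either parser: it assumes the difference comes from perl's internal UTF8 flag, which has not been confirmed here. The variable names are made up for the example; utf8::upgrade and utf8::is_utf8 are standard perl built-ins.

```perl
use strict;
use warnings;
use Data::Dumper;

# The same two characters (U+00E9, space, U+00E9) stored in two ways.
my $latin1 = "\xe9 \xe9";   # one byte per character, UTF8 flag off
my $utf8   = "\xe9 \xe9";
utf8::upgrade($utf8);       # same characters, internal UTF-8, UTF8 flag on

# The strings compare equal although their internal storage differs.
my $same        = ($latin1 eq $utf8)     ? 1 : 0;
my $flag_latin1 = utf8::is_utf8($latin1) ? 1 : 0;
my $flag_utf8   = utf8::is_utf8($utf8)   ? 1 : 0;

print "equal: $same, UTF8 flag: $flag_latin1 vs $flag_utf8\n";

# Dump both; if the flag is what matters, the two dumps may be
# rendered differently even though the strings are equal.
$Data::Dumper::Terse = 1;
print Dumper($latin1), Dumper($utf8);
```

If the two dumps differ in the same way as the test results above, the internal storage would be the likely cause, and the locale could be looked at afterwards.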