From: Ralph Corderoy
Subject: Re: [Nmh-workers] mhshow/test-charset failures in nmh-1.7 (`Can't convert ?us-ascii to UTF-8')
Date: Thu, 23 Nov 2017 16:22:41 +0000
Hi Leonardo,
> After finding that having the `libiconv' package installed made a
> difference I first checked whether the several nmh binaries were linked
> against the GNU iconv(3) or the NetBSD iconv(3) and in both cases they're
> correctly linked to the NetBSD iconv(3).
So NetBSD has two iconv implementations available, and both supply a
library and iconv(1)? Can both packages be installed at the same time?
It sounds like it. And nmh correctly picks the "native" NetBSD library.
Which package provides the iconv in $PATH when both are installed? And
is /usr/pkg/bin/iconv the other one?
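One way to answer the PATH question is to list every iconv executable on
PATH in lookup order. This is only a sketch in POSIX sh: `list_iconvs' is
a hypothetical helper name, not something shipped by nmh or NetBSD.

```shell
# Hypothetical helper: print each executable named iconv found on PATH,
# in the order the shell would search the directories.
list_iconvs() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/iconv" ]; then
            printf '%s\n' "$dir/iconv"
        fi
    done
    IFS=$old_ifs
}

list_iconvs
# The first entry should match what the shell itself resolves.
command -v iconv || echo "no iconv on PATH"
```

The first line printed is the one the test suite's bare `iconv' would run.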
> #### For unknown reasons, the parameter values checks fail on the
> #### FreeBSD10 buildbot. It doesn't support EBCDIC-US, which is used
> #### by the checks, so check for that. Though that doesn't seem to be
> #### the reason.
> printf '\xe4' | iconv -f EBCDIC-US -t UTF-8 >/dev/null 2>&1 ||
>     skip_param_value_checks=1
So with your original report, this test passed, skip_param_value_checks
remained 0, and thus the failing test was later run.
> And, with NetBSD iconv(1) I have:
>
> % printf '\xe4' | /usr/bin/iconv -f EBCDIC-US -t UTF-8
> U
Good. So that's what the above iconv test used because...
> ...while with iconv(1) provided by the `libiconv' package:
>
> % printf '\xe4' | /usr/pkg/bin/iconv -f EBCDIC-US -t UTF-8
> /usr/pkg/bin/iconv: conversion from EBCDIC-US unsupported
> /usr/pkg/bin/iconv: try '/usr/pkg/bin/iconv -l' to get the list of supported
> encodings
> % echo $?
> 1
> So, if GNU iconv(1) is available `$skip_param_value_checks' is
> set to 1.
Yes, on your platform, if it's the iconv chosen by the user's PATH.
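To make the dependence on PATH explicit, the same EBCDIC-US probe can be
pointed at each binary in turn. A rough sketch: `supports_ebcdic_us' is a
hypothetical helper, and the two paths are simply the ones quoted in this
thread, so adjust them for other systems.

```shell
# Hypothetical helper: succeeds if the iconv binary given as $1 can
# convert EBCDIC-US to UTF-8, mirroring the probe in test-charset.
supports_ebcdic_us() {
    printf '\344' | "$1" -f EBCDIC-US -t UTF-8 >/dev/null 2>&1
}

# Paths taken from this thread; binaries that are absent are skipped.
for candidate in /usr/bin/iconv /usr/pkg/bin/iconv; do
    [ -x "$candidate" ] || continue
    if supports_ebcdic_us "$candidate"; then
        echo "$candidate: parameter value checks would run"
    else
        echo "$candidate: skip_param_value_checks would be set to 1"
    fi
done
```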
> I'm now curious what happens on other platforms, apart from FreeBSD
> and NetBSD with the `libiconv' package installed. Just checking the
> exit status of:
>
> $ printf '\xe4' | iconv -f EBCDIC-US -t UTF-8
>
> will probably be enough.
Don't quite understand the question. Here on Arch Linux, ICONV_ENABLED
is 1 so that `printf | iconv' does get run and works so the last two
tests don't get skipped. That's with iconv(1) from glibc 2.26.
> If the exit status is 0 and then, in test-charset context
> `$skip_param_value_checks' is 0, what happens if you try (this is
> stolen entirely from 'replacement character in parameter value' test
> in test-charset):
>
> $ printf "Subject: invalid parameter value charset\nMIME-Version: 1.0\n\
> Content-Type: text/plain; charset*=invalid''%%0Dus-ascii\n" | \
> mhshow -file - | cat
The test passes here, so I get the expected output. (At the command
line I get slightly different, but that's my ~/.mh_profile, etc.,
kicking in.)
start_test 'replacement character in parameter value'
#### The output of this test doesn't show it, but it covers the
#### noiconv: portion of get_param_value().
cat > $msgfile <<'EOF'
Subject: invalid parameter value charset
MIME-Version: 1.0
Content-Type: text/plain; charset*=invalid''%0Dus-ascii
EOF
cat > $expected <<EOF
[ Message inbox:12 ]
Subject: invalid parameter value charset
MIME-Version: 1.0
[ part - text/plain - 0B ]
EOF
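For anyone unfamiliar with the `charset*=' form: it is the RFC 2231
extended parameter syntax, charset'language'percent-encoded-value. The
sketch below only splits the test's value into those three fields with
POSIX parameter expansion; it is an illustration, not how mhshow parses
it, and the variable names are my own.

```shell
# The extended parameter value used by the test above.
param="invalid''%0Dus-ascii"

# Split on the two single quotes: charset ' language ' value.
charset=${param%%\'*}    # text before the first quote
rest=${param#*\'}        # text after the first quote
language=${rest%%\'*}    # text before the second quote (empty here)
value=${rest#*\'}        # text after the second quote

printf 'charset=%s language=%s value=%s\n' "$charset" "$language" "$value"
```

So the declared charset is `invalid', the language is empty, and the value
is a percent-encoded CR (%0D) followed by `us-ascii'.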
> Here, I have:
>
> | Subject: invalid parameter value charset
> |
> | mhshow: Can't convert ?us-ascii to UTF-8
> | mhshow: unable to convert character set from ?us-ascii, continuing...
> | [ part - text/plain - 0B ]
It seems reasonable that `?us-ascii', with a U+003F question mark at the
start, is an invalid source charset. Yet here mhshow calls
iconv_open(3) with it and the call succeeds. If I change
content_charset()'s
ret_charset = get_param(ct->c_ctinfo.ci_first_pm, "charset", '?', 0);
to use an `x' instead then I get a similar mhshow error to yours, but with
`xus-ascii'. So what's special about the question mark to glibc's
iconv_open() that gives `?' a free ride?
I also find this works, oddly.
$ printf '\344' |
> iconv -f EBCDIC-US -t '???us-as???cii???'; printf \\n
U
$
I've run out of answers at this point and will do a bit more digging,
unless someone else here already knows. Was the `?' replacement
character chosen deliberately in content_charset() to exploit this?
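A quick way for others to compare their own iconv is to try a couple of
prefixes from the command line. This is just a sketch: `try_prefix' is a
hypothetical helper, and it exercises whichever iconv(1) is first on PATH
rather than mhshow's own iconv_open() call, so results may differ between
implementations.

```shell
# Hypothetical helper: report whether the iconv on PATH accepts
# "<prefix>us-ascii" as a source charset name.
try_prefix() {
    if printf 'x' | iconv -f "${1}us-ascii" -t UTF-8 >/dev/null 2>&1; then
        printf '%s: accepted\n' "${1}us-ascii"
    else
        printf '%s: rejected\n' "${1}us-ascii"
    fi
}

# Compare the '?' replacement character with an ordinary letter.
for prefix in '?' x; do
    try_prefix "$prefix"
done
```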
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy