Re: [Nmh-workers] mhfixmsg on a pathological mail


From: Ralph Corderoy
Subject: Re: [Nmh-workers] mhfixmsg on a pathological mail
Date: Fri, 01 Sep 2017 12:48:32 +0100

Hi Ken,

> > Well, at least it does if I'm doing comp, whatnow, mime, edit.  If I
> > run mhbuild(1) then it always gives quoted-printable.
> > 
> >     $ mhbuild -
> >     #<text/plain *b64
> >     a£d
> >     w£z
> >     MIME-Version: 1.0
> >     Content-Type: text/plain; charset="UTF-8"
> >     Content-ID: <address@hidden>
> >     Content-MD5: wrfRnlkZxzaLuNL9h63JVA==
> >     Content-Transfer-Encoding: quoted-printable
> > 
> >     a=C2=A3d
> >     w=C2=A3z
>
> That does not look like a valid mhbuild composition file?
> Specifically, you're missing some headers

The first mention mhbuild(1) makes of the input format is

    An mhbuild “composition file” is just a file containing plain text
    that is interspersed with various mhbuild directives.

In the next paragraph it says "Basically, the body contains one or more
contents", and that's the first suggestion there might be something
other than the body.  The grammar at the end starts

    The following is the formal syntax of a mhbuild “composition file”.

        body         ::=     1*(content | EOL)

suggesting it's nothing but a body as that's the start symbol.

> I suspect it didn't read the directive at all.
>
> > Also, it SEGVs without the `/plain'.
>
> Whoops!  Yeah, will fix that for 1.7.

Which suggests it did read the `#<...' line as a directive rather than
just some line of text to skip, or an arbitrary header to skip before the
blank line separating the headers and body.

This pipeline behaves as you suggest.  A directive as the first line
produces text/plain regardless, but it must have known what the `#<' was
because it doesn't appear as part of the text/plain's content.

    $ printf '%s\n' '#<foo/bar' a£d w£z |
    > valgrind -q uip/mhbuild -
    MIME-Version: 1.0
    Content-Type: text/plain; charset="UTF-8"
    Content-ID: <address@hidden>
    Content-MD5: wrfRnlkZxzaLuNL9h63JVA==
    Content-Transfer-Encoding: quoted-printable

    a=C2=A3d
    w=C2=A3z
    $ 

If I add a blank line before `#<' then we're in business;  I get
foo/bar.  No CTE.  Is that OK because the default is 8bit?  mhbuild(1)
says "If an integrity check is being associated with each content by
using the -check switch, then mhbuild will encode each content with a
transfer encoding, even if the content contains only 7-bit data".  My
~/.mh_profile has -check.

    $ printf '%s\n' '' '#<foo/bar' a£d w£z |
    > valgrind -q $T/1*/nmh/uip/mhbuild -
    MIME-Version: 1.0
    Content-Type: foo/bar
    Content-ID: <address@hidden>
    Content-MD5: 7P4/kYRXtg6WrA/hpJ4Phw==

    a£d
    w£z
    $ 

Requesting base64 works.

    $ printf '%s\n' '' '#<foo/bar *b64' a£d w£z |
    > valgrind -q $T/1*/nmh/uip/mhbuild -
    MIME-Version: 1.0
    Content-Type: foo/bar
    Content-ID: <address@hidden>
    Content-MD5: 7P4/kYRXtg6WrA/hpJ4Phw==
    Content-Transfer-Encoding: base64

    YcKjZAp3wqN6Cg==

Back to without the `*b64', and missing the subtype causes a NULL
dereference.  Adding `*b64' back doesn't help.

    $ printf '%s\n' '' '#<foo' a£d w£z |
    > valgrind -q $T/1*/nmh/uip/mhbuild -
    ==7905== Invalid read of size 1
    ==7905==    at 0x4C2F142: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7905==    by 0x10F6D4: build_headers (mhbuildsbr.c:1670)
    ==7905==    by 0x112E4D: build_mime (mhbuildsbr.c:571)
    ==7905==    by 0x10E70B: main (mhbuild.c:345)
    ==7905==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
    ==7905== 
    Segmentation fault (core dumped)
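
The trace shows strlen(3) being handed address 0x0 from build_headers,
which fits the subtype never being set when the directive has no `/'.
As a rough sketch of the kind of check that would turn this into a
reported error, assuming a hypothetical helper (`split_content_type' is
an illustrative name, not nmh's code):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not nmh's actual code: split "type/subtype" in
 * place.  Returns 0 on success, or -1 if the subtype is missing or
 * either part is empty, so the caller can report the problem instead
 * of later passing a NULL subtype to strlen(3).
 */
static int
split_content_type(char *s, char **type, char **subtype)
{
    char *slash;

    *type = *subtype = NULL;
    if (s == NULL || (slash = strchr(s, '/')) == NULL)
        return -1;              /* e.g. "#<foo" gives just "foo" */
    if (slash == s || slash[1] == '\0')
        return -1;              /* "/bar" or "foo/" */
    *slash = '\0';              /* terminate the type part */
    *type = s;
    *subtype = slash + 1;
    return 0;
}
```

A caller would treat -1 as a bad directive and complain to the user
rather than carry on to build the headers.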

> > > > We don't insist on CRLF when receiving RFC 5322, right.
> > >
> > > Right ... I was just musing that maybe we should.
> > 
> > My inner fascist system administrator says yes.  Postel's maxim is
> > wrong.
> > https://tools.ietf.org/html/draft-thomson-postel-was-wrong-00
>
> Sigh.
>
> I understand where you are coming from, really.  But ... practical
> concerns raise their ugly heads, again.
>
> First ... when we get invalid input, how should we react?  It's a fair
> question.

Yes.  Report the problem to the user and either stop, or skip to the
"next" item, e.g. email to retrieve with POP3.  If those reports turn
out to be common, e.g. a behemoth like Gmail does it wrong, then add
code to violate the RFC.  If only oddball users need it, then put it
behind an option for their ~/.mh_profile.

We need to cope with errors anyway, e.g. an I/O problem on the fsync(2)
means POP3's DELE shouldn't be issued.
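
A minimal sketch of that ordering, assuming a hypothetical
`store_message' helper (not an nmh function): the caller only sends
DELE if it returns true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a retrieved message to fd and flush it.
 * Returns true only if every byte was written and fsync(2) succeeded,
 * i.e. only when it would be safe for the caller to issue POP3's DELE.
 * Any write error, including EINTR, is conservatively treated as
 * failure.
 */
static bool
store_message(int fd, const char *msg, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, msg, len);

        if (w <= 0)
            return false;       /* error, or no progress made */
        msg += w;
        len -= (size_t)w;
    }
    return fsync(fd) == 0;      /* an I/O problem here means no DELE */
}
```

On failure the message stays on the server and the problem is reported,
so it can be fetched again later.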

> Secondly, we seem to be alone in terms of strictness with regards to
> parsing MIME messages; other MUAs seem to handle MIME content with
> slightly irregular grammar just fine.

But probably each in its own way.  That tends towards a rough superset
of the variations being implemented over time, without the interactions
between those violations being considered.

> I'm not suggesting we be able to handle anything that /dev/urandom
> puts out

Well, we should without SEGV.  :-)

> but it seems that erroring out and refusing to parse email makes it
> difficult for us to operate in the real world.

We already point out problems and continue in some cases because it's a
common fault, e.g. suppress_extraneous_trailing_semicolon_warning,
though that only seems to trigger on spam for me.  But we do report the
problem rather than silently ignore.

> > > If we did that, a regular expression to handle a line ending with
> > > \r\n would be trivial.
> > 
> > If it were to allow /\r?\n/ then I think it should insist on
> > consistency for all the lines based on the first.  But really, the
> > lexer should be told which one of the two is valid at the start.
>
> Another switch to add to all programs?  Ugh.

No, I mean that the code calling the lexer knows at the start what line
ending is acceptable and should tell the lexer, e.g. mbox lexing wants
/\n/ and any /\r/ seen is part of the line, not the line's terminator.
Nothing required from the user.
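
Roughly, and with hypothetical names throughout (this is a sketch of
the idea, not nmh's lexer), the caller passes the one terminator it
considers valid:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: the one line terminator the caller says is valid. */
enum eol { EOL_LF, EOL_CRLF };

/*
 * Find the end of the line starting at buf, within n bytes.  Returns
 * the line's length excluding the terminator and stores the bytes
 * consumed, terminator included, in *used.  Returns -1 if no valid
 * terminator is found.
 * With EOL_LF, a '\r' before the '\n' is part of the line's content.
 * With EOL_CRLF, a bare '\n' is an error rather than a terminator.
 */
static long
line_length(const char *buf, size_t n, enum eol eol, size_t *used)
{
    const char *nl = memchr(buf, '\n', n);
    size_t len;

    if (nl == NULL)
        return -1;
    len = (size_t)(nl - buf);
    if (eol == EOL_CRLF) {
        if (len == 0 || nl[-1] != '\r')
            return -1;          /* bare LF where CRLF was required */
        len--;                  /* the '\r' belongs to the terminator */
    }
    *used = (size_t)(nl - buf) + 1;
    return (long)len;
}
```

Code lexing an mbox would pass EOL_LF, so "ab\r\n" is a three-byte line
ending in '\r';  code reading off the wire would pass EOL_CRLF and get
"ab".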

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy


