Re: [Nmh-workers] mhfixmsg on a pathological mail

nmh-workers

From:	Ralph Corderoy
Subject:	Re: [Nmh-workers] mhfixmsg on a pathological mail
Date:	Sat, 02 Sep 2017 13:51:59 +0100

Hi Ken,

> I would agree this is easy to miss, and it is confusing at the top of
> the man page where you could definitely read it as saying the whole
> file is a mhbuild composition file, rather than just the body.  Maybe
> you'd be willing to add some man page changes for 1.7?

I think 1.7 should be pushed out the door as soon as we've decided we're
happy with its new features, i.e. after -prefer's reversal as that looks
like Paul's favourite, and there are no trivially triggered faults that
leave users worse off than with 1.6.  That's because a ton of problems
will remain, but since they've been there a while, e.g. mhbuild(1)'s
confusion, typically without receiving complaint, they can wait a bit
longer.

> Hm.  You know ... I can't actually get that to happen if #< is the
> FIRST line.  The second line, yes.  Looking at m_getfld() it seems
> that if we get something that isn't a header, we simply punt over it
> (and silently eat it) and go on the assumption everything after that
> point is the body.  So that seems right.

I agree that's what's happening; the SEGV is only after a blank line.
Silently ignoring the line seems a bug IMO.  How much better if there
were an error about `#<foo/bar' being an invalid email header.  :-)
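
A minimal sketch of the kind of check meant here, in Python rather than
nmh's C, and not how m_getfld() works.  It assumes the RFC 5322 rule
that a field name is one or more printable US-ASCII characters other
than a colon, followed by a colon.  The name check_header_line is
hypothetical, and folded continuation lines are deliberately not
handled.

```python
import re

# RFC 5322 field-name: printable US-ASCII (33-126) except ':' (58),
# then a colon.  White space before the colon is the obsolete syntax
# and is tolerated here.
_FIELD = re.compile(r'^[!-9;-~]+[ \t]*:')


def check_header_line(line):
    """Return None if line starts a header field, else an error string.

    Hypothetical helper: it reports the problem instead of silently
    eating the line.  Folded continuation lines are not handled.
    """
    if _FIELD.match(line):
        return None
    return 'invalid email header: %r' % line
```

With this, check_header_line('#<foo/bar') yields a complaint to show
the user, whereas 'Subject: hi' passes.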

> > If I add a blank line before `#<' then we're in business; I get
> > foo/bar.  No CTE.  Is that OK because the default is 8bit?
>
> Ummm .... no.  It's because it has no idea what foo/bar is and is
> probably falling through some switch statement somewhere.

scan_content() perhaps.

> Seriously, Ralph, what are you trying to accomplish here, other than
> delay the release of 1.7? :-)

I'm trying to point out these are not 1.7 stoppers.  :-)
Nor's this.

    $ printf '%s\n' '' '#<foo/bar/xyzzy' 'Wot no wizard.' |
    > uip/mhbuild -
    MIME-Version: 1.0
    Content-Type: foo/bar
    Content-ID: <address@hidden>
    Content-MD5: +kx1Z6EwfyHjstJbq7OEHQ==

    Wot no wizard.
    $

Nor is mhbuild silently accepting multiple encoding requests, which
violate the grammar; the first one wins.

    $ printf '%s\n' '' '#<text/plain *qp *b64' a£d |
    > uip/mhbuild - |
    > grep -i content-transfer-encoding
    Content-Transfer-Encoding: quoted-printable
    $
    $ printf '%s\n' '' '#<text/plain *b64 *qp' a£d |
    > uip/mhbuild - |
    > grep -i content-transfer-encoding
    Content-Transfer-Encoding: base64
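
For contrast, a rough sketch of complaining instead of letting the first
request win.  It is Python, not mhbuild's C, and the function name
encoding_request is hypothetical.  It assumes, as a simplification, that
every white-space-separated word starting with `*' is an encoding
request, which would misfire on a `*' word inside a comment or
description.

```python
def encoding_request(directive):
    """Return the one encoding request in a '#' directive, or None.

    Raises ValueError when more than one is given rather than quietly
    taking the first.  Assumption: any word starting with '*' is an
    encoding request.
    """
    requests = [w for w in directive.split() if w.startswith('*')]
    if len(requests) > 1:
        raise ValueError('multiple encoding requests: ' + ' '.join(requests))
    return requests[0] if requests else None
```

Given '#<text/plain *qp *b64' this raises, whereas '#<text/plain *qp'
returns '*qp'.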

Nor is mhbuild allowing multiple comments, which the grammar doesn't,
with only the one positioned according to the grammar making it through.

    $ printf '%s\n' '' '#<text/plain (foo) <id> (bar) [desc] (xyzzy) {disp}' a£d |
    > uip/mhbuild - |
    > egrep 'foo|bar|xyzzy'
    Content-Type: text/plain; charset="UTF-8" (foo)
    $
    $ printf '%s\n' '' '#<text/plain <id> (bar) [desc] (xyzzy) {disp}' a£d |
    > uip/mhbuild - |
    > egrep 'foo|bar|xyzzy'
    $ 

> > Report the problem to the user and either stop, or skip to the
> > "next" item, e.g. email to retrieve with POP3.  If those reports
> > turn out to be common, e.g. a behemoth like Gmail does it wrong,
> > then add code to violate the RFC.  If only oddball users need it,
> > then put it behind an option for their ~/.mh_profile.
>
> Well, my beef there is actual users complain when they get those
> warnings.

Good, else we'd never know they were triggered.  They are our guinea
pigs.  Hopefully, it won't be our fault too often, and when we can point
out it's a violation by something else then that may pacify them,
especially if we think it's likely to recur often enough to work around.
But that's different from being slack in the first place, letting much
pass silently, allowing errors to compound, and discarding bits we don't
quite recognise in the hope that nobody wanted them.

> My problem is that it's not clear how the calling code can "know" if
> it wants \r\n or just \n.  Consider the case that got us here: user
> wanted to run mhfixmsg(1) in their MTA in a step where \r\n appeared.
> How is mhfixmsg supposed to know for that one case it needs to know
> about \r\n?

If mhfixmsg(1) wanted to attempt to cope with that unusual case then it
could attempt to lex multiple times?  I don't think a program that's
trying to fix data that violates a grammar whilst using a parser for the
grammar is a good example.  :-)
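
To illustrate "lex multiple times", a small sketch in Python, not nmh
code: a strict lexer is told the line terminator, and the caller tries
LF first, then CRLF.  The names lex_headers and lex_with_fallback are
hypothetical, and "strict" here only means that no stray CR or LF may
remain inside a line.

```python
def lex_headers(data, eol):
    """Strictly split a header block (bytes) into lines ending in eol.

    Raises ValueError if the block doesn't end with eol or if a stray
    CR or LF is left inside any line.
    """
    if not data.endswith(eol):
        raise ValueError('header block does not end with %r' % eol)
    lines = data[:-len(eol)].split(eol)
    for line in lines:
        if b'\r' in line or b'\n' in line:
            raise ValueError('stray line terminator in %r' % line)
    return lines


def lex_with_fallback(data):
    """Try LF, then CRLF; return (eol, lines) for the first that lexes."""
    for eol in (b'\n', b'\r\n'):
        try:
            return eol, lex_headers(data, eol)
        except ValueError:
            continue
    raise ValueError('neither LF nor CRLF line endings lex cleanly')
```

The lexer stays strict about one grammar at a time; only the caller
that expects the unusual case pays for the retry, and mixed line
endings are still reported rather than guessed at.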

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
