Re: [nmh-workers] INCing of email archives


From: Ralph Corderoy
Subject: Re: [nmh-workers] INCing of email archives
Date: Thu, 25 Jul 2019 09:19:17 +0100

Hi Bakul,

> Once in a while I download email archives of some mailing list
> and unpack them using "inc -file <archive-file>". But more
> than once I have seen that inc gets confused and doesn't
> unpack the whole thing. The cause seems to be a line starting
> with From in some message body.

Then it isn't any of the four mbox formats described at
https://en.wikipedia.org/wiki/Mbox#Family ?

> Ideally inc should check that a "From ..." line is immediately followed
> by header lines.  And if this is not the case, assume it is in the
> message body.

I agree that would be one heuristic to help, but it would also have
problems:

    From the outset, was clear we failed 42
    times: the first on attempting to read faulty input...
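
To make the problem concrete, here is a rough sketch of that heuristic, not anything inc does today. The helper name from_candidates and the header pattern (letters, digits and hyphens followed by a colon) are assumptions made for illustration. It prints the line number of every "From " line whose next line looks like a header field.

```shell
# Hypothetical helper: print the line number of each "From " line whose
# following line looks like a header field ("Name: value").
# The header pattern is a simplifying assumption, not the full RFC 5322 rule.
from_candidates() {
    awk '
        # Previous line began with "From " and this one looks like a header.
        prev_from && /^[A-Za-z0-9-]+:/ { print NR - 1 }
        # Remember whether this line began with "From " for the next line.
        { prev_from = /^From / }
    ' "$@"
}

# One genuine separator (line 1) and the body text above (lines 4-5).
printf '%s\n' \
    'From alice@example.org Thu Jul 25 09:19:17 2019' \
    'Subject: test' \
    '' \
    'From the outset, was clear we failed 42' \
    'times: the first on attempting to read faulty input...' |
    from_candidates
```

Both line 1 and line 4 are reported, because "times:" is indistinguishable from a header field name under this test.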

> fix() {
>       grep -n '^From .*[^0-9]$' $1 | sed 's/:.*/s|^|>|/' > ,$1
>       if [ -s ,$1 ]; then echo wq >> ,$1; cat ,$1 | ed $1; fi
>       rm ,$1
> }
>
> This prepends a > to any line beginning with "From " and not
> ending with a digit.

    sed -i '/^From .*[^0-9]$/s/^/> /' "${1?}"
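
A small sketch of what that substitution does to sample input, assuming the same pattern but writing to stdout rather than editing in place; the wrapper name quote_from is made up for illustration. Note that it prepends "> " with a space, where the ed version prepends a bare ">", and that -i is not in POSIX sed: GNU sed accepts it as written, while BSD sed expects a backup-suffix argument after it.

```shell
# Hypothetical wrapper: the same substitution, without -i, so the result
# goes to stdout and the input file is left alone.
quote_from() {
    sed '/^From .*[^0-9]$/s/^/> /' "$@"
}

printf '%s\n' \
    'From alice@example.org Thu Jul 25 09:19:17 2019' \
    'From the outset, was clear we failed 42' \
    'From here on it gets worse.' |
    quote_from
```

Only the third line is quoted. The first ends in a digit and is taken for a real separator, and so is the second, since it happens to end in "42"; body lines like that one are not caught by the ends-with-a-digit test.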

-- 
Cheers, Ralph.


