bug#7155: [md5sum] does not accept

From: Pádraig Brady
Subject: bug#7155: [md5sum] does not accept
Date: Wed, 14 Sep 2011 16:13:03 +0100

On 09/14/2011 03:12 PM, Jim Meyering wrote:
> Pádraig Brady wrote:
>> severity 7155 wishlist
>> tags 7155 + notabug
>> Comments below.
>> On 10/16/2010 11:37 PM, Pádraig Brady wrote:
>>> On 16/10/10 20:37, Rimas Kudelis wrote:
>>>> On Sun, 03 Oct 2010 23:27:24 +0100, Pádraig Brady <address@hidden>
>>>> wrote:
>>>>> On 03/10/10 20:24, Rimas Kudelis wrote:
>>>>>> Hi,
>>>>>> I have a little problem with md5sum.
>>>>>> A FreeBSD box generates an md5 sum of a file, which I'm later trying to
>>>>>> check on a Linux box. The problem is that what FreeBSD's md5 outputs is
>>>>>> slightly different from what Linux's md5sum expects, which makes md5sum
>>>>>> complain. The difference is really trivial: md5 outputs one space
>>>>>> between the sum and the file name, and md5sum outputs/expects two:
>>>>> md5 seems to output a different format here.
>>>>> $ head -n1 /etc/motd
>>>>> FreeBSD 8.0-RELEASE-p3 (GENERIC) #0: Wed May 26 05:45:12 UTC 2010
>>>>> $ md5sum --version | head -n1
>>>>> md5sum (GNU coreutils) 8.3
>>>>> $ md5 file | tee t.md5
>>>>> MD5 (file) = b85d6fb9ef4260dcf1ce0a1b0bff80d3
>>>>> $ md5sum -c t.md5
>>>>> file: OK
>>>>> Could you verify exactly which md5 utility you're using?
>>>> Sorry for taking so long to answer, but I wasn't the person producing the
>>>> checksum, so I had to ask too. The command used to produce the checksum is:
>>>> $ md5 -r <filename>
>>>> FreeBSD release version is the same as yours. I've just tested the same
>>>> command with FreeBSD 6.2, and it only outputs one space too.
>>> Ah right, md5 -r produces the alternate format.
>>> I suppose we could support a single space,
>>> by trying to open("*abc") after trying to open ("abc").
>>> There is still an ambiguity if both files are present,
>>> though that is unlikely. I'll have a look.
>> Thinking more about this, there is a bit of a
>> security issue with mixing both formats.
>> Consider the case currently with a checksum in BSD format
>> of ' important_file' (with leading space).
>> b85d6fb9ef4260dcf1ce0a1b0bff80d3  important_file
>> Now an attacker does:
>> mv ' important_file' important_file
>> cp trojan ' important_file'
>> And since coreutils will check 'important_file' first,
>> then we'll not report any errors.
>> If coreutils supported both formats then this would be compounded.
>> I suppose we could mitigate that somewhat by detecting and
>> supporting only a single format per run, as done in the following diff.
>> Note we don't handle the case where the first or only
>> entry in a BSD format checksum file has a file name with a
>> leading ' ' or '*'.  We could support this and avoid detection
>> with an option to specify BSD format checksums, but
>> I don't think that's warranted. Note we could detect
>> in this situation too by retrying the open with the
>> leading char included, but that would introduce a security
>> issue. Consider the following standard format checksum file:
>> b85d6fb9ef4260dcf1ce0a1b0bff80d3  firewall_rules
>> Attackers could then do this undetected:
>> mv firewall_rules ' firewall_rules'
> Good point.
> I don't see a way to make GNU md5sum handle this automatically
> and safely.  However, that's not a big deal: it's easy to
> convert from one format to the other using sed or perl.
> I'd be inclined to mark this "wontfix" (because we cannot)
> or simply to close it.

Yes, it's debatable.

To summarize, I think the only caveat with my patch is that
it will give a false error for BSD format checksum files
whose first entry has a file name starting with ' ' or '*'.
That should be exceedingly rare though, and is a lot better
than a false OK.
Also, in this case we're susceptible to the 'trojan' case
above even without the patch.
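
To make the per-run detection idea concrete, here is a rough shell
sketch. It is not the patch itself (that is C inside md5sum); the
function name detect_format and the labels it prints are made up for
illustration. It assumes 32 hex digit MD5 sums and looks only at the
first line of the checksum file:

```shell
# Hypothetical helper, not coreutils code: classify a checksum file
# by its first line only, so one format applies to the whole run.
detect_format() {
  first=
  # Read the first line of stdin; tolerate a missing final newline.
  IFS= read -r first || :
  if printf '%s\n' "$first" | grep -Eq '^[0-9a-fA-F]{32} [ *]'; then
    # Sum, space, then ' ' (text) or '*' (binary): md5sum's own format.
    echo standard
  elif printf '%s\n' "$first" | grep -Eq '^[0-9a-fA-F]{32} [^ *]'; then
    # Sum, one space, then the name: what 'md5 -r' writes.
    echo bsd
  else
    # Anything else, e.g. the tagged "MD5 (file) = sum" form.
    echo other
  fi
}

# Usage (files.md5 is a placeholder name):
#   detect_format < files.md5
```

Note that an 'md5 -r' file whose first name begins with ' ' or '*'
comes out as "standard" here, which is exactly the false error case
described above.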

The workaround is easy as you suggest:

  sed 's/ /  /' files.md5 | md5sum -c

However that is not easily discoverable.
I'm 50:50, so I'll think a bit more.
Hmm, I might just document in the info manual that
the checksum utilities are compatible with
the BSD ones when their output is processed like:

  md5 -r files... | sed 's/ /  /' > files.md5
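
For anyone scripting this, a slightly more defensive variant of the
same conversion might look like the sketch below. The function name
bsd_to_gnu is made up, and the 32 hex digit anchor is an assumption
that only MD5 sums are involved. Anchoring on the sum means a tagged
"MD5 (file) = ..." line passes through unchanged, whereas the plain
's/ /  /' would double the space after "MD5":

```shell
# Hypothetical wrapper: turn 'md5 -r' lines ("sum name") into the
# two-space form ("sum  name") that md5sum -c expects.
bsd_to_gnu() {
  # Double only the space that directly follows a 32 hex digit sum.
  # Spaces inside the file name are left alone, as are lines that
  # do not start with a sum.
  sed 's/^\([0-9a-fA-F]\{32\}\) /\1  /'
}

# Usage (file names are placeholders):
#   md5 -r files... | bsd_to_gnu > files.md5
#   md5sum -c files.md5
```

Like the one-liner, this is not idempotent: running it over a file
that is already in md5sum's format adds a third space, which md5sum
would then read as a file name with a leading space. So apply it only
to 'md5 -r' output.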


