grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] udf: Fix regression which is preventing symlink access


From: Daniel Kiper
Subject: Re: [PATCH] udf: Fix regression which is preventing symlink access
Date: Wed, 15 Sep 2021 16:52:28 +0200
User-agent: NeoMutt/20170113 (1.7.2)

On Tue, Sep 14, 2021 at 06:19:03PM +0000, Glenn Washburn wrote:
> On Tue, 14 Sep 2021 16:27:55 +0200
> Daniel Kiper <dkiper@net-space.pl> wrote:
>
> > On Fri, Sep 10, 2021 at 04:03:23PM +0000, Glenn Washburn wrote:
> > > This code was broken by commit 3f05d693 ("malloc: Use overflow
> > > checking primitives where we do complex allocations"), which added
> > > overflow checking in many areas. The problem here is that the
> > > changes update the local variable sz, which was already in use and
> > > which was not updated before the change. So the code using sz was
> > > getting a different value of than it would have previously for the
> > > same UDF image. This causes the logic getting the destination of
> > > the symlink to not realize that its gotten the full destination,
> > > but keeps trying to read past the end of the destination. The bytes
> > > after the end are generally NULL padding bytes, but that's not a
> > > valid component type (ECMA-167 14.16.1.1). So grub_udf_read_symlink
> > > branches to error logic, returning NULL, instead of the symlink
> > > destination path.
> > >
> > > The result of this bug is that the UDF filesystem tests were
> > > failing in the symlink test with the grub-fstest error message:
> > >
> > >     grub-fstest: error: cannot open `(loop0)/sym': invalid symlink.
> > >
> > > This change stores the result of doubling sz in another local
> > > variable s, so as not to modify sz. Also remove unnecessary
> > > grub_add, which increased the output by 1 to account for a NULL
> > > byte. This isn't needed because an output buffer of size twice sz
> > > is already guaranteed to be more than enough to contain the path
> > > components converted to UTF-8. The worst case upper- bound for the
> > > needed output buffer size is (sz-4)*1.5, where 4 is the size
> >
> > I think 4 comes from ECMA-167 spec. Could you add a reference to it
> > here? The number of paragraph would be perfect...
>
> Its 14.16.1 basically in the same place as the reference earlier, which
> is why I didn't include it. But, yes, I can include it.

Yes, please.

> > > of a path component header and 1.5 is the maximum growth in bytes
> > > when converting from 2-byte unicode code-points to UTF-8 (from 2
> > > bytes to 3).
> >
> > Could you explain how did you come up with the 1.5 value? It would be
> > nice if you refer to a spec or something like that.
>
> There is no spec that I know of (but would be happy to know of one). Its
> something I've deduced based on my understanding of Unicode, UTF-8, and
> UTF-16. All unicode code points less than or equal to 2 bytes (code
> points <0x10000) can be represented in UTF-8 by a maximum of 3 bytes
> [1]. Longer UTF-16 encodings don't matter because those will be 4 bytes
> or longer. The maximum number of bytes for a UTF-8 encoding of a unicode

The [1] says: Since Errata DCN-5157, the range of code points was
expanded to all code points from Unicode 4.0 (or any newer or older
version), which includes Plane 1-16 characters such as Emoji.

So, I think your assumption about longer encodings is incorrect for the UDF.

> code point is 4 bytes. So the longer UTF-16 encodings can only be equal
> to or longer than the UTF-8 encoding, thus the UTF-16 -> UTF-8 would be
> shrinking or maintaining the length of the original byte string. Since
> the worst case growth is 2 bytes to 3, that's 1.5 times the original
> string size. QED.
>
> Do you want all that in there? I could just remove that part about the
> 1.5 too.
>
> Here's an SO question that addresses this. Yes, unofficial, but I think
> adds some weight as to the correctness of the logic above.
>
> https://stackoverflow.com/questions/55056322/maximum-utf-8-string-size-given-utf-16-size
>
> Glenn
>
> [1] https://en.wikipedia.org/wiki/UTF-8#Encoding

Daniel

[1] https://en.wikipedia.org/wiki/Universal_Disk_Format#Character_set



reply via email to

[Prev in Thread] Current Thread [Next in Thread]