grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] fs/btrfs: Make extent item iteration to handle gaps


From: Michael Chang
Subject: Re: [PATCH] fs/btrfs: Make extent item iteration to handle gaps
Date: Thu, 28 Oct 2021 17:27:18 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Thu, Oct 28, 2021 at 03:36:10PM +0800, The development of GNU GRUB wrote:
> Gentle ping?
> 
> Without this patch, the new mkfs.btrfs NO_HOLES feature would break any
> kernel/initramfs with hole in it.
> 
> And considering the modification is already small, I believe this patch is
> definitely worthy as a bug fix.

+1

This does worth more attention as booting would have been blocked.

> 
> Thanks,
> Qu
> 
> On 2021/10/16 09:40, Qu Wenruo wrote:
> > [BUG]
> > Grub btrfs implementation can't handle two very basic btrfs file
> > layouts:
> > 
> > 1. Mixed inline/regualr extents
> >     # mkfs.btrfs -f test.img
> >     # mount test.img /mnt/btrfs
> >     # xfs_io -f -c "pwrite 0 1k" -c "sync" -c "falloc 0 4k" \
> >            -c "pwrite 4k 4k" /mnt/btrfs/file
> >     # umount /mnt/btrfs
> >     # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"
> > 
> >     Such mixed inline/regular extents case is not recommended layout,
> >     but all existing tools and kernel can handle it without problem
> > 
> > 2. NO_HOLES feature
> >     # mkfs.btrfs -f test.img -O no_holes
> >     # mount test.img /mnt/btrfs
> >     # xfs_io -f -c "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/btrfs/file
> >     # umount /mnt/btrfs
> >     # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"
> > 
> >     NO_HOLES feature is going to be the default mkfs feature in the incoming
> >     v5.15 release, and kernel has support for it since v4.0.
> > 
> > [CAUSE]
> > The way GRUB btrfs code iterates through file extents relies on no gap
> > between extents.
> > 
> > If any gap is hit, then grub btrfs will error out, without any proper
> > reason to help debug the bug.
> > 
> > This is a bad assumption, since a long long time ago btrfs has a new
> > feature called NO_HOLES to allow btrfs to skip the padding hole extent
> > to reduce metadata usage.
> > 
> > The NO_HOLES feature is already stable since kernel v4.0 and is going to
> > be the default mkfs feature in the incoming v5.15 btrfs-progs release.
> > 
> > [FIX]
> > When there is a extent gap, instead of error out, just try next item.
> > 
> > This is still not ideal, as kernel/progs/U-boot all do the iteration
> > item by item, not relying on the file offset continuity.
> > 
> > But it will be way more time consuming to correct the whole behavior
> > than starting from scratch to build a proper designed btrfs module for GRUB.
> > 
> > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > ---
> >   grub-core/fs/btrfs.c | 33 ++++++++++++++++++++++++++++++---
> >   1 file changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> > index 63203034dfc6..4fbcbec7524a 100644
> > --- a/grub-core/fs/btrfs.c
> > +++ b/grub-core/fs/btrfs.c
> > @@ -1443,6 +1443,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
> >         grub_size_t csize;
> >         grub_err_t err;
> >         grub_off_t extoff;
> > +      struct grub_btrfs_leaf_descriptor desc;
> >         if (!data->extent || data->extstart > pos || data->extino != ino
> >       || data->exttree != tree || data->extend <= pos)
> >     {
> > @@ -1455,7 +1456,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
> >       key_in.type = GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM;
> >       key_in.offset = grub_cpu_to_le64 (pos);
> >       err = lower_bound (data, &key_in, &key_out, tree,
> > -                        &elemaddr, &elemsize, NULL, 0);
> > +                        &elemaddr, &elemsize, &desc, 0);
> >       if (err)
> >         return -1;
> >       if (key_out.object_id != ino
> > @@ -1494,10 +1495,36 @@ grub_btrfs_extent_read (struct grub_btrfs_data 
> > *data,
> >                     PRIxGRUB_UINT64_T "\n",
> >                     grub_le_to_cpu64 (key_out.offset),
> >                     grub_le_to_cpu64 (data->extent->size));
> > +     /*
> > +      * The way of extent item iteration is pretty bad, it completely
> > +      * requires all extents are contiguous, which is not ensured.
> > +      *
> > +      * Features like NO_HOLE and mixed inline/regular extents can cause
> > +      * gaps between file extent items.
> > +      *
> > +      * The correct way is to follow kernel/U-boot to iterate item by
> > +      * item, without any assumption on the file offset continuity.
> > +      *
> > +      * Here we just manually skip to next item and re-do the verification.
> > +      *
> > +      * TODO: Rework the whole extent item iteration code, if not the
> > +      * whole btrfs implementation.
> > +      */
> >       if (data->extend <= pos)
> >         {
> > -         grub_error (GRUB_ERR_BAD_FS, "extent not found");
> > -         return -1;
> > +         err = next(data, &desc, &elemaddr, &elemsize, &key_out);
> > +         if (err < 0)
> > +           return -1;
> > +         /* No next item for the inode, we hit the end */
> > +         if (err == 0 || key_out.object_id != ino ||
> > +             key_out.type != GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM)
> > +                 return pos - pos0;
> > +
> > +         csize = grub_le_to_cpu64(key_out.offset) - pos;
> > +         buf += csize;
> > +         pos += csize;
> > +         len -= csize;

Does it make sense to add some sort of range check here to safegard len
from being underflow ? My justfication is that csize value is read out
from disk so can be anything wild presumably.

Thanks,
Michael


> > +         continue;
> >         }
> >     }
> >         csize = data->extend - pos;
> > 
> 
> 
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]