[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`
From: |
Amjad Alsharafi |
Subject: |
Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset` |
Date: |
Wed, 12 Jun 2024 00:22:46 +0800 |
On Tue, Jun 11, 2024 at 04:30:53PM +0200, Kevin Wolf wrote:
> Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> > On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > > The field is marked as "the offset in the file (in clusters)", but it
> > > > was being used like this
> > > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > >
> > > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > > match, as this matches the case when adding new clusters for files, and
> > > > its inevitable that we reach this condition when doing that if the
> > > > clusters are not after one another, so there is no reason to `abort`
> > > > here, execution continues and the new clusters are written to disk
> > > > correctly.
> > > >
> > > > Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com>
> > >
> > > Can you help me understand how first_mapping_index really works?
> > >
> > > It seems to me that you get a chain of mappings for each file on the FAT
> > > filesystem, which are just the contiguous areas in it, and
> > > first_mapping_index refers to the mapping at the start of the file. But
> > > for much of the time, it actually doesn't seem to be set at all, so you
> > > have mapping->first_mapping_index == -1. Do you understand the rules
> > > around when it's set and when it isn't?
> >
> > Yeah. So `first_mapping_index` is the index of the first mapping, each
> > mapping is a group of clusters that are contiguous in the file.
> > Its mostly `-1` because the first mapping will have the value set as
> > `-1` and not its own index, this value will only be set when the file
> > contain more than one mapping, and this will only happen when you add
> > clusters to a file that are not contiguous with the existing clusters.
>
> Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
> can work with. So just to confirm, this is the invariant that we think
> should always hold true, right?
>
> assert((mapping->mode & MODE_DIRECTORY) ||
> !mapping->info.file.offset ||
> mapping->first_mapping_index > 0);
>
Yes.
We can add this into `get_cluster_count_for_direntry` loop.
I'm thinking of also converting those `abort` into `assert`, since
the line `copy_it = 1;` was confusing me, since it was after the `abort`.
> > And actually, thanks to that I noticed another bug not fixed in PATCH 3,
> > We are doing this check
> > `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> > to know if we should switch to the new mapping or not.
> > If we were reading from the first mapping (`first_mapping_index == -1`)
> > and we jumped to the second mapping (`first_mapping_index == n`), we
> > will catch this condition and switch to the new mapping.
> >
> > But if the file has more than 2 mappings, and we jumped to the 3rd
> > mapping, we will not catch this since (`first_mapping_index == n`) for
> > both of them haha. I think a better check is to check the `mapping`
> > pointer directly. (I'll add it also in the next series together with a
> > test for it.)
>
> This comparison is exactly what confused me. I didn't realise that the
> first mapping in the chain has a different value here, so I thought this
> must mean that we're looking at a different file now - but of course I
> couldn't see a reason for that because we're iterating through a single
> file in this function.
>
> But even now that I know that the condition triggers when switching from
> the first to the second mapping, it doesn't make sense to me. We don't
> have to copy things around just because a file is non-contiguous.
>
> What we want to catch is if the order of mappings has changed compared
> to the old state. Do we need a linked list, maybe a prev_mapping_index,
> instead of first_mapping_index so that we can compare if it is still the
> same as before?
I think this would be the better design (tbh, that's what I thought
`first_mapping_index` would do), though not sure if other components
depend so much into the current design that it would be hard to change.
I'll try to implement this `prev_mapping_index` and see how it goes.
>
> Or actually, I suppose that's the first block with an abort() in the
> code, just that it doesn't compare mappings, but their offsets.
I think, I'm still confused on the whole logic there, the function
`get_cluster_count_for_direntry` is a mess, and it doesn't just
*get* the cluster count, it also schedule writeouts and may
copy clusters around.
>
> > >
> > > > block/vvfat.c | 12 +++++++-----
> > > > 1 file changed, 7 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > > index 19da009a5b..f0642ac3e4 100644
> > > > --- a/block/vvfat.c
> > > > +++ b/block/vvfat.c
> > > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > > >
> > > > assert(s->current_fd);
> > > >
> > > > -
> > > > offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > > + offset = s->cluster_size *
> > > > + ((cluster_num - s->current_mapping->begin)
> > > > + + s->current_mapping->info.file.offset);
> > > > if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > > > return -3;
> > > > s->cluster=s->cluster_buffer;
> > > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s,
> > > > direntry_t* direntry, const ch
> > > > (mapping->mode & MODE_DIRECTORY) == 0) {
> > > >
> > > > /* was modified in qcow */
> > > > - if (offset != mapping->info.file.offset +
> > > > s->cluster_size
> > > > - * (cluster_num - mapping->begin)) {
> > > > + if (offset != s->cluster_size
> > > > + * ((cluster_num - mapping->begin)
> > > > + + mapping->info.file.offset)) {
> > > > /* offset of this cluster in file chain has
> > > > changed */
> > > > abort();
> > > > copy_it = 1;
> > > > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s,
> > > > direntry_t* direntry, const ch
> > > >
> > > > if (mapping->first_mapping_index !=
> > > > first_mapping_index
> > > > && mapping->info.file.offset > 0) {
> > > > - abort();
> > > > copy_it = 1;
> > > > }
> > >
> > > I'm unsure which case this represents. If first_mapping_index refers to
> > > the mapping of the first cluster in the file, does this mean we got a
> > > mapping for a different file here? Or is the comparison between -1 and a
> > > real value?
> >
> > Now that I think more about it, I think this `abort` is actually
> > correct, the issue though is that the handling around this code is not.
> >
> > What this `abort` actually does is that it checks.
> > - if the `mapping->first_mapping_index` is not the same as
> > `first_mapping_index`, which **should** happen only in one case, when
> > we are handling the first mapping, in that case
> > `mapping->first_mapping_index == -1`, in all other cases, the other
> > mappings after the first should have the condition true.
> > - From above, we know that this is the first mapping, so if the offset
> > is not `0`, then abort, since this is an invalid state.
>
> Yes, make sense.
>
> > This is all good, the issue is that `first_mapping_index` is not set if
> > we are checking from the middle, the variable `first_mapping_index` is
> > only set if we passed through the check `cluster_was_modified` with the
> > first mapping, and in the same function call we checked the other
> > mappings.
>
> I think I noticed the same yesterday, but when I tried to write a quick
> patch that I could show you and that would update first_mapping_index in
> each iteration, I broke something. So I decided I'd first ask you what
> all of this even means. :-)
>
> > From what I have seen, that doesn't happen since even if you write the
> > whole file in one go, you are still writing it cluster by cluster, and
> > the checks happen at that time.
>
> Well, we do trigger the condition, but I suppose updating
> first_mapping_index in each loop iteration is really the way to go if
> you think the same.
Indeed, I did a quick change, modifying the loop to always go through
and set the `first_mapping_index` for the first mapping fixes the issue
and we can put the `abort` back in place.
I'll also modify the check to instead be
`mapping->first_mapping_index < 0 && mapping->info.file.offset > 0`
This will make it clear that this applies only to the first mapping.
>
> Kevin
>
[PATCH v4 1/4] vvfat: Fix bug in writing to middle of file, Amjad Alsharafi, 2024/06/04
[PATCH v4 3/4] vvfat: Fix reading files with non-continuous clusters, Amjad Alsharafi, 2024/06/04
[PATCH v4 4/4] iotests: Add `vvfat` tests, Amjad Alsharafi, 2024/06/04