Re: [PATCH v4 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs


From: Christian Schoenebeck
Subject: Re: [PATCH v4 04/16] hw/9pfs: Implement Windows specific xxxdir() APIs
Date: Tue, 07 Feb 2023 11:11:31 +0100

On Monday, February 6, 2023 6:37:16 AM CET Shi, Guohuai wrote:
[...]
> > I know, it's an n-square performance issue; that's what I already wrote in
> > the summary of the linked original suggestion [1] in v3 before, quote:
> > 
> >   + Relatively straight-forward to implement.
> > 
> >   + No (major) changes in 9pfs code base required.
> > 
> >   - Still n-square performance issue (negligible for landing Windows host
> >     support IMO).
> > 
> >   o Consistency assured for "most" cases, except one: if hardlinks are
> >     inserted in between then it might fail
> 
> readdir() on a Linux host may also return deleted entries.
> And the POSIX specification does not mention the consistency issue.

POSIX explicitly specifies that 1. new and 2. deleted entries may or may not
appear in the result and leaves that implementation-specific. That was never
our concern.

And yes, POSIX does not explicitly discuss consistency concerning entries that
have neither been added nor removed, but this expectation is implied. In
practice double entries are probably less of an issue; clients might be able
to handle them without misbehaving (haven't checked this at all yet). But if
the implementation could lead to entries *never* appearing to clients at all,
even after refreshing periodically, then how could you work with a file system
like that?

> NTFS file id is the $MFT index id. It will remain unique until the file is
> deleted. But the index id may be reused if many files are deleted and
> re-created.
> 
> Saving the file id instead of the name will make consistency better, but may
> not cover all cases, because reading a directory is not an "atomic" operation.

I don't see an issue with that: these are entries that were either added or
removed, so we don't care about them. And their file IDs would not affect
fetching the other directory entries that have not been touched in between.

And we are also not questioning atomicity here, but consistency.
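
Just to make sure we mean the same thing by the file id variant, here is a
minimal sketch (the helper name, the simplified error handling and the use of
GetFinalPathNameByHandleW() are mine, purely for illustration, not from the
series): opendir() would enumerate the directory once and cache only the
fixed-size file ids, and readdir()/seekdir() would then resolve each cached id
back to a name on demand, e.g.:

    #include <windows.h>

    /* Resolve a cached NTFS file id back to an entry (path) name.
     * dir_handle may be any open handle on the same volume, e.g. the
     * directory itself opened with FILE_FLAG_BACKUP_SEMANTICS. */
    static BOOL file_id_to_name(HANDLE dir_handle, const FILE_ID_128 *fid,
                                wchar_t *name, DWORD name_len)
    {
        FILE_ID_DESCRIPTOR fidd = { 0 };
        fidd.dwSize = sizeof(fidd);
        fidd.Type = ExtendedFileIdType;
        fidd.ExtendedFileId = *fid;

        HANDLE h = OpenFileById(dir_handle, &fidd, FILE_READ_ATTRIBUTES,
                                FILE_SHARE_READ | FILE_SHARE_WRITE |
                                FILE_SHARE_DELETE, NULL,
                                FILE_FLAG_BACKUP_SEMANTICS);
        if (h == INVALID_HANDLE_VALUE) {
            return FALSE; /* entry was deleted in between -> just skip it */
        }

        /* Returns the full normalized path; the caller strips everything up
         * to the last separator to get the bare entry name. */
        DWORD n = GetFinalPathNameByHandleW(h, name, name_len,
                                            FILE_NAME_NORMALIZED |
                                            VOLUME_NAME_DOS);
        CloseHandle(h);
        return n > 0 && n < name_len;
    }

Entries removed in between simply fail to resolve and are skipped, entries
added in between are not in the cache, and everything else keeps its position;
that is exactly the consistency we care about here.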

> > [1] https://lore.kernel.org/qemu-devel/2468168.SvRIHAoRfs@silver/
> > 
> > The idea was to use that just as a starting point to land Windows host
> > support ASAP, slower on large dirs compared to other solutions, yes, but
> > with guaranteed correct and deterministic behaviour. And then in the long
> > run we would of course replace that with a more performant solution.
> > 
> > I mean, this is really simple to implement, so I would at least test it.
> > If it really runs horribly slow we could still discuss faster solutions,
> > which are however all much more tricky.
> > 
> 
> I did a basic test on a Windows host; here is the code:
> 
>     /* pDir/pEnt use the opendir_win32()/readdir_win32() types from this series */
>     clock_t st, ed;
>     int totals = 0;
> 
>     st = clock();
>     pDir = opendir_win32(TEST_DIR);
> 
>     if (pDir == NULL)
>         return -1;
> 
>     while ((pEnt = readdir_win32(pDir)) != NULL)
>     {
>         totals++;
>     }
>     closedir_win32(pDir);
>     ed = clock();
> 
>     printf("total = %d clocks = %d %d\n", totals, (int)(ed - st), CLOCKS_PER_SEC);
> 
> My local storage is an SSD disk.
> 
> I ran this test for 100, 1000, 10000 entries.
> For the file name cache solution, the time cost is: 2, 9, 44 (in ms).
> For the file id cache solution, the time cost is: 3, 438, 4338 (in ms).
> I already used OpenFileById() instead of CreateFile() to make it faster.
> If I use CreateFile(), it needs more than 80 seconds.
> 
> The performance does not look good.
> And actually, it would be worse in 9pfs.
> Because in the current design, 9pfs may seek forward and back several times
> while reading a directory, the performance may get even worse there.

Poor performance, yes, probably a bit worse than I would have expected.

So it is about choosing your poison (potential crash vs. poor performance).

I mean, I am not keen on suggesting any kind of bike shedding to you on this
issue, but if this is merged, then people expect it to behave reliably and not
allow a guest to crash the QEMU host process simply by creating a large number
of directory entries on the guest.

I was also considering making it a QEMU option, but OTOH this is a temporary
situation and those options would be wiped once we have an appropriate
solution a bit later.

I am open to suggestions. Could we perhaps just mark Windows host support as
experimental for now? Is that even allowed by QEMU policies?

> > > So I think storing all name entries would be better than storing all file IDs.
> > 
> > As already discussed, NTFS allows up to (2^32 - 1) = 4,294,967,295 entries
> > per directory. So caching only one directory (entirely) in RAM can already
> > exceed the available RAM, which would crash QEMU. Multiply that by the
> > expected amount of directory lookups by clients and we get into much higher
> > categories, even with much smaller individual directory sizes.
> > 
> 
> The Windows file id structure is 24 bytes, which is not a small structure.
> If you think the performance is acceptable, I can rework this commit based
> on the file id.
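
For perspective, a quick back-of-envelope (my numbers, only to put the two
cache variants side by side; the 24 bytes are sizeof(FILE_ID_INFO), i.e. the
64-bit volume serial plus the 128-bit file id, and 255 UTF-16 characters is
the worst-case NTFS name length):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const uint64_t max_entries = UINT64_C(0xFFFFFFFF); /* 2^32 - 1, NTFS per-dir limit */
        const uint64_t id_size     = 24;                   /* sizeof(FILE_ID_INFO)         */
        const uint64_t name_size   = 255 * 2;              /* worst-case UTF-16 entry name */

        /* Worst case for caching one single, fully populated directory: */
        printf("file id cache: %.1f GiB\n",
               max_entries * id_size   / (double)(1 << 30)); /* ~96 GiB             */
        printf("name cache:    %.1f GiB\n",
               max_entries * name_size / (double)(1 << 30)); /* ~2040 GiB, i.e. ~2 TiB */
        return 0;
    }

So in the worst case even the file id cache reaches sizes no host has as RAM;
it merely pushes the limit out by roughly an order of magnitude compared to
caching names.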




