Re: [Qemu-devel] d_off field in struct dirent and 32-on-64 emulation
From: Andreas Dilger
Subject: Re: [Qemu-devel] d_off field in struct dirent and 32-on-64 emulation
Date: Thu, 27 Dec 2018 17:23:28 -0700

On Dec 27, 2018, at 10:41 AM, Peter Maydell <address@hidden> wrote:
>
> On Thu, 27 Dec 2018 at 17:19, Florian Weimer <address@hidden> wrote:
>> We have a bit of an interesting problem with respect to the d_off
>> field in struct dirent.
>>
>> When running a 64-bit kernel on certain file systems, notably ext4,
>> this field uses the full 63 bits even for small directories (strace -v
>> output, wrapped here for readability):
>>
>> getdents(3, [
>> {d_ino=1494304, d_off=3901177228673045825, d_reclen=40,
>> d_name="authorized_keys", d_type=DT_REG},
>> {d_ino=1494277, d_off=7491915799041650922, d_reclen=24, d_name=".",
>> d_type=DT_DIR},
>> {d_ino=1314655, d_off=9223372036854775807, d_reclen=24, d_name="..",
>> d_type=DT_DIR}
>> ], 32768) = 88
>>
>> When running in 32-bit compat mode, this value is somehow truncated to
>> 31 bits, for both the getdents and the getdents64 (!) system call (at
>> least on i386).
>
> Yes -- look for hash2pos() and friends in fs/ext4/dir.c.
> The ext4 code in the kernel uses a 32 bit hash if (a) the kernel
> is 32 bit, (b) this is a compat syscall, or (c) some other bit of
> the kernel asked it to via the FMODE_32BITHASH flag (currently only
> NFS does that I think).
>
> As you note, this causes breakage for userspace programs which
> need to implement an API/ABI with 32-bit offset but which only
> have access to the kernel's 64-bit offset API/ABI.

This is (IMHO) a bit of an oxymoron, isn't it? Applications use
the 64-bit API but store the value in a 32-bit field? The same
problem would exist for filesystems with 64-bit inodes or 64-bit
file offsets if applications tried to store these values in 32-bit
variables. It might work most of the time, but it can also break
randomly.
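
As a rough illustration of the narrowing problem, here is a minimal
sketch of the check a 64-to-32 translation layer would need before
storing d_off in a 32-bit signed field. The helper name
fits_in_32bit_off() is hypothetical and does not exist in QEMU, glibc
or the kernel:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if a 64-bit d_off value can be stored in a signed 32-bit
 * field without losing information, 0 otherwise. */
static int fits_in_32bit_off(int64_t d_off)
{
    return d_off >= 0 && d_off <= INT32_MAX;
}
```

With the values from the strace output above, d_off=3901177228673045825
and d_off=9223372036854775807 both fail this check, while a small
byte-style offset would pass.
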
> I think the best fix for this would be for the kernel to either
> (a) consistently use a 32-bit hash or (b) to provide an API
> so that userspace can use the FMODE_32BITHASH flag the way
> that kernel-internal users already can.

It would be relatively straightforward to add a "32bitapi" mount
option to return a 32-bit directory hash to userspace for operations
on that mountpoint (ext4 doesn't have 64-bit inode numbers yet).
However, I can't think of an easy way to do this on a per-process
basis without just having it call the 32-bit API directly.

> I couldn't think of or find any existing way for userspace
> to get the right results here, which is why
> 32-bit-guest-on-64-bit-host QEMU doesn't work on these filesystems
> (depending on what exactly the guest's libc etc do).
>
>> the 32-bit getdents system call emulation in a 64-bit qemu-user
>> process would just silently truncate the d_off field as part of
>> the translation, not reporting an error.
>> [...]
>> This truncation has always been a bug; it breaks telldir/seekdir
>> at least in some cases.
>
> Yes; you can't fit a quart into a pint pot, so if the guest
> only handles 32-bit offsets then truncation is about all we
> can do. This works fine if offsets are offsets, assuming the
> directory isn't so enormous it would have broken the guest
> anyway. I'm not aware of any issues with this other than the
> oddball ext4 offsets-are-hashes situation -- could you expand
> on the telldir/seekdir issue? (I suppose we should probably
> make QEMU's syscall emulation layer return "no more entries"
> rather than entries with truncated hashes.)

For ext4 at least, you could just shift the high 32 bits of the
64-bit hash down into a 32-bit value in telldir(), and shift it
back up when seekdir() is called.
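
A minimal sketch of that shift, assuming the 64-bit d_off cookie keeps
its significant bits in the upper 32 bits. The function names
cookie_to_pos32() and pos32_to_cookie() are made up for this example
and are not existing QEMU or libc interfaces:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only.
 * Assumption: the upper 32 bits of the 64-bit directory cookie are
 * enough to locate the position again. */

/* telldir() side: keep only the upper 32 bits of the 64-bit cookie. */
static uint32_t cookie_to_pos32(uint64_t d_off)
{
    return (uint32_t)(d_off >> 32);
}

/* seekdir() side: shift the 32-bit value back up.
 * The lower 32 bits of the original cookie are not recovered and
 * come back as zero. */
static uint64_t pos32_to_cookie(uint32_t pos)
{
    return (uint64_t)pos << 32;
}
```

Since d_off is a non-negative 64-bit value, the shifted result never
exceeds 0x7fffffff, so it also fits in a signed 32-bit long. The cost
is that the low 32 bits are discarded, so two entries whose cookies
differ only in those bits would map to the same position.
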
Cheers, Andreas