Re: Approaches for same-on-same linux-user execve?


From: Laurent Vivier
Subject: Re: Approaches for same-on-same linux-user execve?
Date: Thu, 7 Oct 2021 20:59:01 +0200

On 07/10/2021 at 16:32, Alex Bennée wrote:
> Hi,
> 
> I came across a use-case this week for ARM although this may be also
> applicable to architectures where QEMU's emulation is ahead of the
> hardware currently widely available - for example if you want to
> exercise SVE code on AArch64. When the linux-user architecture is not
> the same as the host architecture then binfmt_misc works perfectly fine.
> 
> However in the case you are running same-on-same you can't use
> binfmt_misc to redirect execution to using QEMU because any attempt to
> trap native binaries will cause your userspace to hang as binfmt_misc
> will be invoked to run the QEMU binary needed to run your application
> and a deadlock ensues.
> 
> There are some hacks you can apply at a local level, like tweaking the
> ELF header of the binaries you want to run under emulation and adjusting
> the binfmt_misc mask appropriately. This works but is messy and a faff to
> set up.
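
A minimal sketch of that ELF-header hack, to make it concrete. It marks a
binary by setting one e_ident padding byte and builds a binfmt_misc rule
that matches only marked binaries, so an unmarked QEMU binary still runs
natively and no deadlock occurs. The choice of byte 9, the marker value,
the rule name and the interpreter path are all assumptions made for
illustration; the original message does not say which header field is
tweaked.

```python
ELF_MAGIC = b"\x7fELF"
EI_NIDENT = 16     # size of the e_ident array at the start of an ELF file
TAG_OFFSET = 9     # assumption: first e_ident padding byte
TAG_VALUE = 0x51   # assumption: arbitrary non-zero marker (ASCII "Q")


def tag_header(header: bytes) -> bytes:
    """Return a copy of an ELF header with the marker byte set."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    patched = bytearray(header)
    patched[TAG_OFFSET] = TAG_VALUE
    return bytes(patched)


def tag_pattern():
    """Magic/mask pair: match the ELF magic and the marker, ignore the rest."""
    magic = bytearray(EI_NIDENT)
    mask = bytearray(EI_NIDENT)
    magic[:4] = ELF_MAGIC
    mask[:4] = b"\xff" * 4
    magic[TAG_OFFSET] = TAG_VALUE
    mask[TAG_OFFSET] = 0xFF
    return bytes(magic), bytes(mask)


def rule_matches(header: bytes) -> bool:
    """Model of the masked comparison binfmt_misc applies to a file's start."""
    magic, mask = tag_pattern()
    return all((h ^ m) & k == 0 for h, m, k in zip(header, magic, mask))


def hex_escape(data: bytes) -> str:
    return "".join(f"\\x{byte:02x}" for byte in data)


def tagged_rule(name: str, interpreter: str) -> str:
    """Build a ':name:type:offset:magic:mask:interpreter:flags' rule string."""
    magic, mask = tag_pattern()
    return f":{name}:M:0:{hex_escape(magic)}:{hex_escape(mask)}:{interpreter}:"
```

The string returned by tagged_rule() is what would be written, as root, to
/proc/sys/fs/binfmt_misc/register. Whether every loader involved tolerates
a non-zero padding byte is not something this sketch verifies.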
> 
> An ideal setup would be for the kernel to catch a SIGILL from a
> failing user space program and then to re-launch the process using QEMU
> with the old process's maps and execution state so it could continue.
> However I suspect there are enough moving parts to make this very
> fragile (e.g. what happens to the results of library feature probing
> code). So two approaches I can think of are:
> 
> Trap execve in QEMU linux-user
> ------------------------------
> 
> We could add a flag to QEMU so at the point of execve it manually
> invokes the new process with QEMU, passing on the flag to persist this
> behaviour.
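
The argument rewriting this implies can be sketched as follows. This is
only a model in Python of what QEMU would do internally at the execve
syscall; the flag name is made up here, and the QEMU path is an
assumption. The "-0" option is the existing linux-user option that sets
the guest's argv[0].

```python
from typing import List, Sequence, Tuple

# Hypothetical name for the proposed flag; invented for this sketch.
PERSIST_FLAG = "--hypothetical-persist-execve"


def rewrite_execve(qemu_path: str, path: str,
                   argv: Sequence[str]) -> Tuple[str, List[str]]:
    """Turn execve(path, argv) into an execve of QEMU running path.

    The persist flag is passed on so that the child QEMU applies the
    same rewriting to any execve made by the new guest process.
    """
    new_argv = [qemu_path, PERSIST_FLAG]
    if argv:
        # Preserve the argv[0] the guest asked for.
        new_argv += ["-0", argv[0]]
    new_argv.append(path)
    new_argv += list(argv[1:])
    return qemu_path, new_argv
```

For example, rewrite_execve("/usr/bin/qemu-aarch64", "/bin/ls", ["ls", "-l"])
yields an execve of /usr/bin/qemu-aarch64 with the flag, "-0 ls", the
program path and then "-l".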

Another approach could be to use ptrace(PTRACE_SYSEMU) to catch syscalls.

We would need a wrapper that loads the first target binary and forks; it
attaches to the process with ptrace() and intercepts the syscalls to
emulate them, as is done in User-mode Linux.

I was thinking of this solution, for instance, to execute big-endian
programs (like ppc64) on a little-endian system (ppc64le).

But I'm not sure it fits what you need...
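
To illustrate what "emulating" a syscall would involve in the big-endian
on little-endian case: the wrapper would have to convert any structure
the guest passes by pointer before issuing the real syscall itself. The
sketch below shows only that conversion step, for a 64-bit struct
timespec as passed to nanosleep(); choosing this structure is an
assumption for illustration, and reading the bytes out of the traced
process is left out.

```python
import struct


def timespec_be_to_host(buf: bytes) -> bytes:
    """Convert a big-endian 64-bit struct timespec to host byte order.

    buf holds the 16 bytes read from the traced process: tv_sec followed
    by tv_nsec, each a signed 64-bit integer in big-endian order.
    """
    tv_sec, tv_nsec = struct.unpack(">qq", buf)
    # "=" selects the host's native byte order with standard sizes.
    return struct.pack("=qq", tv_sec, tv_nsec)
```

Results written back by the kernel would need the reverse conversion
before being copied into the guest's memory.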


> 
> Add path mask to binfmt_misc
> ----------------------------
> 
> The other option would be to extend binfmt_misc to have a path mask so
> it only applies its alternative execution scheme to binaries in a
> particular section of the file-system (or maybe some sort of pattern?).
> 
> Are there any other approaches you could take? Which do you think has
> the most merit?

I don't know if it applies to what you want, but years ago I wrote a
binfmt namespace that applies the binfmt configuration only within a
container. I didn't finish the work (it seems there may be some security
issues in what I did):

https://lore.kernel.org/lkml/20191216091220.465626-2-laurent@vivier.eu/T/

Thanks,
Laurent


