[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] linux-user: Improve strace output for read() and getcwd()
From: |
Aleksandar Markovic |
Subject: |
Re: [PATCH] linux-user: Improve strace output for read() and getcwd() |
Date: |
Mon, 25 Nov 2019 21:54:52 +0100 |
On Mon, Nov 25, 2019 at 9:40 PM Helge Deller <address@hidden> wrote:
>
> On 25.11.19 14:46, Aleksandar Markovic wrote:
> > On Sun, Nov 24, 2019 at 2:31 PM Aleksandar Markovic
> > <address@hidden> wrote:
> >>
> >>
> >>
> >> On Sunday, November 24, 2019, Helge Deller <address@hidden> wrote:
> >>>
> >>> On 24.11.19 13:10, Aleksandar Markovic wrote:
> >>>>
> >>>>
> >>>> On Sunday, November 24, 2019, Helge Deller <address@hidden
> >>>> <mailto:address@hidden>> wrote:
> >>>>
> >>>> On 23.11.19 12:34, Aleksandar Markovic wrote:
> >>>> > On Thursday, November 21, 2019, Helge Deller <address@hidden
> >>>> <mailto:address@hidden> <mailto:address@hidden <mailto:address@hidden>>>
> >>>> wrote:
> >>>> >
> >>>> > The strace functionality in qemu-user lacks the possibility to
> >>>> trace
> >>>> > which real values get returned to pointers in userspace by
> >>>> syscalls.
> >>>> >
> >>>> > For example, the read() and getcwd() syscalls currently only
> >>>> show the
> >>>> > destination address where the syscalls should put the return
> >>>> values:
> >>>> > 2532 read(3,0xff80038c,512) = 512
> >>>> > 2532 getcwd(0x18180,4096) = 9
> >>>> >
> >>>> > With the patch below, one now can specify in
> >>>> print_syscall_late() which
> >>>> > syscalls should be executed first, before they get printed.
> >>>> > After adding the read() and getcwd() syscalls, we now get this
> >>>> output in
> >>>> > with strace instead:
> >>>> > 1708
> >>>> read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> >>>> = 512
> >>>> > 1708 getcwd("/usr/bin",4096) = 9
> >>>> >
> >>>> > This patch adds just the framework with the respective
> >>>> implemenations for
> >>>> > read() and getcwd(). If applied, more functions can be added
> >>>> easily later.
> >>>> >
> >>>> >
> >>>> > Great out-of-the-box idea! However, there are some things that are
> >>>> > not thought over yet. There is a good reason why strace happens
> >>>> > "early": if something crash-like happens during syscall
> >>>> translation,
> >>>> > we still have trace of original target syscall. In case of "late"
> >>>> > flavor, we don't have anything. Another potentially problematic
> >>>> > aspect is that end user certainly should know whether the trace was
> >>>> > done "early" or "late" - otherwise, there will be, for certain,
> >>>> cases
> >>>> > of misinterpretation / misleading / confusion of end users.
> >>>>
> >>>> Thanks for the feedback - I'm shortly sending a v2 version which
> >>>> prints the syscall name with an open parenthesis, e.g. "read(" or
> >>>> "getcwd(",
> >>>> before calling the syscall. That way you can see where it
> >>>> crashed/hangs...
> >>>>
> >>>>
> >>>> ... but I cannot see passed arguments (for example, whether NULL is
> >>>> passed, or somrthing that looks like a real address)...
> >>>
> >>> Right.
> >>> And you won't see that in native strace either...
> >>> If you have an idea, just let me know.
> >>>
> >>
> >> Helge, I do have an idea how to amend your proposal that should give us,
> >> believe it or not, "best of both worlds", but let's wait few days, I don't
> >> have any dev setup ar hand at the moment to check it in action, and on top
> >> of that I am swamped with other unrelated tasks. In the meantime, maybe
> >> other guys will say something too, or perhaps you would have something new
> >> as well.
> >>
> >> Still, by no means, I see this as a (potential) *huge* improvement of QEMU
> >> strace. Thanks!
> >>
> >
> > Helge, Laurent,
> >
> > I don't have time to experiment, but I will write down here my
> > suggestion to extending Helge's system in a textual form, based on
> > example from prevous Helge's patches:
> >
> > 1. CURRENT STATE:
> >
> > early: read(3,0xff80038c,512)
> > late: = 512
> >
> > early: getcwd(0x18180,4096)
> > late: = 9
> >
> > 2. HELGE'S PROPOSAL 1:
> >
> > late:
> > read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> > = 512
> > late: getcwd("/usr/bin",4096) = 9
> >
> > 3. HELGE'S PROPOSAL 2:
> >
> > early: read
> > late:
> > (3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> > = 512
> > early: getcwd
> > late: ("/usr/bin",4096) = 9
>
> My v2 patch actually includes the opening parenthesis in the early part:
> early: read(
> late:
> 3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) =
> 512
>
> The open parenthesis should make it visible, that the call hasn't happened
> yet.
>
Helge, with due respect, you are mixed up here. The purpose of QEMU
strace is to trace *target* syscalls, not *host* syscalls. Target
syscall read() happened.
> > 4. NEW PROPOSAL :
> >
> > early: read(3,0xff80038c,512)
> > late: = 512
> > [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
> > early: getcwd(0x18180,4096)
> > late: = 9 [("/usr/bin",4096)]
> >
> > In other words, we would print for (selected) system calls content of
> > the parameters both before and after actual ioctl
> > translation/execution.
>
> I don't like this.
> IMHO it just adds unnecessary noise and differs from what a native strace
> shows.
>
QEMU strace doesn't have to be the same as native strace. Actually,
they could be used even at the same time: QEMU provides info on target
syscall, strace on host syscall. Do not forget that not all syscalls
are implemeted as one-to-one-correspondence between target and host
syscalls - combination of QEMU strace and native strace is precious at
some moments.
In any case, I think that requiring QEMU strace to look exactly like
native strace is a self-imposed limitation, that does not have enough
justification. Think of it, wouldn't it be nice to surpass strace,
like in these cases above?
Yours,
Aleksandar
> Helge
>
>
> >>>> >
> >>>> > Signed-off-by: Helge Deller <address@hidden
> >>>> <mailto:address@hidden> <mailto:address@hidden <mailto:address@hidden>>>
> >>>> >
> >>>> > diff --git a/linux-user/strace.c b/linux-user/strace.c
> >>>> > index de43238fa4..ff254732af 100644
> >>>> > --- a/linux-user/strace.c
> >>>> > +++ b/linux-user/strace.c
> >>>> > @@ -61,6 +61,7 @@ UNUSED static void
> >>>> print_open_flags(abi_long, int);
> >>>> > UNUSED static void print_syscall_prologue(const struct
> >>>> syscallname *);
> >>>> > UNUSED static void print_syscall_epilogue(const struct
> >>>> syscallname *);
> >>>> > UNUSED static void print_string(abi_long, int);
> >>>> > +UNUSED static void print_encoded_string(abi_long addr,
> >>>> unsigned int maxlen, int last);
> >>>> > UNUSED static void print_buf(abi_long addr, abi_long len, int
> >>>> last);
> >>>> > UNUSED static void print_raw_param(const char *, abi_long,
> >>>> int);
> >>>> > UNUSED static void print_timeval(abi_ulong, int);
> >>>> > @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct
> >>>> syscallname *sc)
> >>>> > gemu_log(")");
> >>>> > }
> >>>> >
> >>>> > +#define MAX_ENCODED_CHARS 32
> >>>> > +static void
> >>>> > +print_encoded_string(abi_long addr, unsigned int maxlen, int
> >>>> last)
> >>>> > +{
> >>>> > + unsigned int maxout;
> >>>> > + char *s, *str;
> >>>> > +
> >>>> > + s = lock_user_string(addr);
> >>>> > + if (s == NULL) {
> >>>> > + /* can't get string out of it, so print it as pointer
> >>>> */
> >>>> > + print_pointer(addr, last);
> >>>> > + return;
> >>>> > + }
> >>>> > +
> >>>> > + str = s;
> >>>> > + gemu_log("\"");
> >>>> > + maxout = MIN(maxlen, MAX_ENCODED_CHARS);
> >>>> > + while (maxout--) {
> >>>> > + unsigned char c = *str++;
> >>>> > + if (isprint(c)) {
> >>>> > + gemu_log("%c", c);
> >>>> > + } else {
> >>>> > + gemu_log("\\%o", (unsigned int) c);
> >>>> > + }
> >>>> > + }
> >>>> > + unlock_user(s, addr, 0);
> >>>> > +
> >>>> > + gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." :
> >>>> "",
> >>>> > + get_comma(last));
> >>>> > +}
> >>>> > +
> >>>> > static void
> >>>> > print_string(abi_long addr, int last)
> >>>> > {
> >>>> > @@ -1633,6 +1665,19 @@ print_futimesat(const struct
> >>>> syscallname *name,
> >>>> > }
> >>>> > #endif
> >>>> >
> >>>> > +#ifdef TARGET_NR_getcwd
> >>>> > +static void
> >>>> > +print_getcwd(const struct syscallname *name,
> >>>> > + abi_long arg0, abi_long arg1, abi_long arg2,
> >>>> > + abi_long arg3, abi_long arg4, abi_long arg5)
> >>>> > +{
> >>>> > + print_syscall_prologue(name);
> >>>> > + print_string(arg0, 0);
> >>>> > + print_raw_param("%u", arg1, 1);
> >>>> > + print_syscall_epilogue(name);
> >>>> > +}
> >>>> > +#endif
> >>>> > +
> >>>> > #ifdef TARGET_NR_settimeofday
> >>>> > static void
> >>>> > print_settimeofday(const struct syscallname *name,
> >>>> > @@ -2428,6 +2473,20 @@ print_fstatat64(const struct
> >>>> syscallname *name,
> >>>> > #define print_newfstatat print_fstatat64
> >>>> > #endif
> >>>> >
> >>>> > +#ifdef TARGET_NR_read
> >>>> > +static void
> >>>> > +print_read(const struct syscallname *name,
> >>>> > + abi_long arg0, abi_long arg1, abi_long arg2,
> >>>> > + abi_long arg3, abi_long arg4, abi_long arg5)
> >>>> > +{
> >>>> > + print_syscall_prologue(name);
> >>>> > + print_raw_param("%d", arg0, 0);
> >>>> > + print_encoded_string(arg1, arg2, 0);
> >>>> > + print_raw_param("%u", arg2, 1);
> >>>> > + print_syscall_epilogue(name);
> >>>> > +}
> >>>> > +#endif
> >>>> > +
> >>>> > #ifdef TARGET_NR_readlink
> >>>> > static void
> >>>> > print_readlink(const struct syscallname *name,
> >>>> > diff --git a/linux-user/strace.list b/linux-user/strace.list
> >>>> > index d49a1e92a8..220b1f4c46 100644
> >>>> > --- a/linux-user/strace.list
> >>>> > +++ b/linux-user/strace.list
> >>>> > @@ -272,7 +272,7 @@
> >>>> > { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
> >>>> > #endif
> >>>> > #ifdef TARGET_NR_getcwd
> >>>> > -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
> >>>> > +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
> >>>> > #endif
> >>>> > #ifdef TARGET_NR_getdents
> >>>> > { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
> >>>> > @@ -1080,7 +1080,7 @@
> >>>> > { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
> >>>> > #endif
> >>>> > #ifdef TARGET_NR_read
> >>>> > -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
> >>>> > +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
> >>>> > #endif
> >>>> > #ifdef TARGET_NR_readahead
> >>>> > { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
> >>>> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >>>> > index ce399a55f0..c0079ca2b7 100644
> >>>> > --- a/linux-user/syscall.c
> >>>> > +++ b/linux-user/syscall.c
> >>>> > @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void
> >>>> *cpu_env, int num, abi_long arg1,
> >>>> > return ret;
> >>>> > }
> >>>> >
> >>>> > +/*
> >>>> > + * True if this syscall should be printed after having called
> >>>> the native
> >>>> > + * syscall, so that values which are fed back to userspace
> >>>> gets printed.
> >>>> > + */
> >>>> > +static int print_syscall_late(int syscall)
> >>>> > +{
> >>>> > + switch (syscall) {
> >>>> > + case TARGET_NR_getcwd:
> >>>> > + case TARGET_NR_read:
> >>>> > + return 1;
> >>>> > + default:
> >>>> > + return 0;
> >>>> > + }
> >>>> > +}
> >>>> > +
> >>>> > abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> >>>> > abi_long arg2, abi_long arg3, abi_long
> >>>> arg4,
> >>>> > abi_long arg5, abi_long arg6, abi_long
> >>>> arg7,
> >>>> > @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env,
> >>>> int num, abi_long arg1,
> >>>> > arg2, arg3, arg4, arg5, arg6, arg7,
> >>>> arg8);
> >>>> >
> >>>> > if (unlikely(do_strace)) {
> >>>> > - print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> >>>> arg6);
> >>>> > + int late_printing;
> >>>> > + late_printing = print_syscall_late(num);
> >>>> > + if (!late_printing) {
> >>>> > + print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> >>>> arg6);
> >>>> > + }
> >>>> > ret = do_syscall1(cpu_env, num, arg1, arg2, arg3,
> >>>> arg4,
> >>>> > arg5, arg6, arg7, arg8);
> >>>> > + if (late_printing) {
> >>>> > + print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> >>>> arg6);
> >>>> > + }
> >>>> > print_syscall_ret(num, ret);
> >>>> > } else {
> >>>> > ret = do_syscall1(cpu_env, num, arg1, arg2, arg3,
> >>>> arg4,
> >>>> >
> >>>>
> >>>
>
- [PATCH] linux-user: Improve strace output for read() and getcwd(), Helge Deller, 2019/11/21
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), no-reply, 2019/11/21
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Aleksandar Markovic, 2019/11/23
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Helge Deller, 2019/11/24
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Aleksandar Markovic, 2019/11/24
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Helge Deller, 2019/11/24
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Aleksandar Markovic, 2019/11/24
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Aleksandar Markovic, 2019/11/25
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Helge Deller, 2019/11/25
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(),
Aleksandar Markovic <=
- Re: [PATCH] linux-user: Improve strace output for read() and getcwd(), Laurent Vivier, 2019/11/26