Re: [Libunwind-devel] Time Cost of Libunwind is too expensive.

From: Chenggang
Subject: Re: [Libunwind-devel] Time Cost of Libunwind is too expensive.
Date: Wed, 26 Nov 2014 21:47:29 +0800 (CST)

   Thanks for your advice. I will study the fast trace code this weekend. I hope it helps.


At 2014-11-26 17:21:59, "Lassi Tuura" <address@hidden> wrote:

So that sounds like you are doing out-of-process stack walking. I have only ever measured in-process, so I am not sure I can usefully help you. In theory x86-64 fast trace could be used to speed up out-of-process stack walking, but that's just a theory - I don't have any data on where the time is spent. You might want to profile your profiler with another tool to get an idea where the time goes.

The patches I mentioned added the unw_backtrace() API, which for in-process tracing should automatically use fast trace if it can. If you have a self-coded loop using unw_step(), it would be slower. Maybe try using the fast trace optimisations for out-of-process tracing too?


On Wed, Nov 26, 2014 at 5:42 AM, Chenggang <address@hidden> wrote:
Hi, Lassi:
    The unwind cost is a major problem for me right now. Let me describe my system in detail.
    I am writing a profiling service for a cloud system. The service samples all CPUs in a machine, and the cloud system has thousands of machines. The sampling frequency is 10 Hz. Every machine runs RHEL 5u7, the x86_64 version. libunwind is the latest version (I got it from git).
    All programs are written in C/C++ and compiled with GCC/G++ 4.1.2.
    I collect the sampling information the way perf does: the stack and registers are saved in the kernel, and the stack is then unwound with external walking. The APIs I use look like this:


/* Callbacks libunwind uses to reach the target's memory and registers
   during external (remote) unwinding. */
static unw_accessors_t accessors = {
    .find_proc_info         = find_proc_info,
    .put_unwind_info        = put_unwind_info,
    .get_dyn_info_list_addr = get_dyn_info_list_addr,
    .access_mem             = access_mem,
    .access_reg             = access_reg,
    .access_fpreg           = access_fpreg,
    .resume                 = resume,
    .get_proc_name          = get_proc_name,
};

    It must be a slow method.
    I read the email:
    It looks like a faster method. Does it use different APIs?

Regards,
Chenggang

At 2014-11-22 21:16:11, "Lassi Tuura" <address@hidden> wrote:
That doesn't sound normal to me, but how exactly are you doing the walking? What operating system are you using, is it 32- or 64-bit, which library version, how did you build it, are you using external (ptrace) or in-process (UNW_LOCAL_ONLY) walking, what exact API are you calling to walk, what language and compiler did you use for your program, etc.?

Here are some reference numbers from another profiling tool (igprof) using libunwind a few years back:

The time, in clock cycles, to walk about 30 stack frames on average was in the ballpark of 2500 for very frequent walks (3M/sec), and 70000 for less frequent setitimer interrupts at 200/sec (~5 ms interval).

On Sat, Nov 22, 2014 at 10:05 AM, Chenggang <address@hidden> wrote:
     I am a user of libunwind. I am developing a profiling system, "Bianque".
     I use libunwind to unwind the stack on the target machine, but it takes too much time.
     When the call chain is 130 frames deep and the stack size is 1 MB, it takes 3.8 milliseconds to unwind.
     My CPU is a Xeon(R) CPU E5-2430 0 @ 2.20GHz.
     Is this cost normal?
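
To put the numbers in this thread on the same scale, the 3.8 ms figure can be converted to clock cycles at the stated 2.20 GHz and divided by the frame count. The following is only a back-of-the-envelope sketch: the helper names are made up for illustration, and it assumes the nominal clock rate applies (no turbo or frequency scaling).

```c
#include <assert.h>
#include <stdio.h>

/* Convert a duration in milliseconds to clock cycles at a nominal
 * clock rate given in GHz (1 ms at 1 GHz is 1e6 cycles). */
double ms_to_cycles(double ms, double ghz)
{
    return ms * ghz * 1e6;
}

/* Average cost of one frame, given the cost of a whole walk. */
double cycles_per_frame(double cycles, int frames)
{
    assert(frames > 0);
    return cycles / frames;
}

/* Print the per-frame costs for the figures quoted in the thread. */
void print_unwind_costs(void)
{
    double walk = ms_to_cycles(3.8, 2.2);   /* 130-frame walk, 2.20 GHz */

    printf("reported walk:      %.0f cycles, %.0f cycles/frame\n",
           walk, cycles_per_frame(walk, 130));
    printf("igprof, 3M/sec:     %.0f cycles/frame\n",
           cycles_per_frame(2500.0, 30));
    printf("igprof, 200/sec:    %.0f cycles/frame\n",
           cycles_per_frame(70000.0, 30));
}
```

With these inputs the reported walk comes to about 8.36 million cycles, or roughly 64,000 cycles per frame, against roughly 83 and 2,300 cycles per frame for the two igprof cases. The igprof figures are for in-process walking, so the comparison only indicates the size of the gap.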

