From: Steven
Subject: Re: [Qemu-devel] qemu log function to print out the registers of the guest
Date: Mon, 27 Aug 2012 12:15:56 -0400

On Sat, Aug 25, 2012 at 4:41 PM, Max Filippov <address@hidden> wrote:
> On Sat, Aug 25, 2012 at 9:20 PM, Steven <address@hidden> wrote:
>> On Tue, Aug 21, 2012 at 3:18 AM, Max Filippov <address@hidden> wrote:
>>> On Tue, Aug 21, 2012 at 9:40 AM, Steven <address@hidden> wrote:
>>>> Hi, Max,
>>>> I wrote a small program to verify that your patch catches all the
>>>> load instructions from the guest. However, I found some problems in
>>>> the results.
>>>>
>>>> The guest OS and the emulated machine are both 32-bit x86. My simple
>>>> program in the guest declares a 1048576-element integer array,
>>>> initializes the elements, and then loads them back in a loop. It
>>>> looks like this:
>>>>           int array[1048576];
>>>>           /* initialize the array */
>>>>
>>>>           /* region of interest */
>>>>           int temp;
>>>>           for (i=0; i < 1048576; i++) {
>>>>               temp = array[i];
>>>>           }
>>>> So ideally, the patch should catch the guest virtual addresses of
>>>> the loads in the loop, right?
>>>>           In addition, the virtual addresses of the beginning and
>>>> the end of the array are 0xbf68b6e0 and 0xbfa8b6e0.
>>>>           What I got is as follows:
>>>>
>>>>           __ldl_mmu, vaddr=bf68b6e0
>>>>           __ldl_mmu, vaddr=bf68b6e4
>>>>           __ldl_mmu, vaddr=bf68b6e8
>>>>           .....
>>>>           These should be the virtual addresses accessed by the
>>>> above loop. The results look good because the gap between
>>>> consecutive vaddrs is 4 bytes, which is the size of each element.
>>>>           However, after a certain address, I got
>>>>
>>>>           __ldl_mmu, vaddr=bf68bffc
>>>>           __ldl_mmu, vaddr=bf68c000
>>>>           __ldl_mmu, vaddr=bf68d000
>>>>           __ldl_mmu, vaddr=bf68e000
>>>>           __ldl_mmu, vaddr=bf68f000
>>>>           __ldl_mmu, vaddr=bf690000
>>>>           __ldl_mmu, vaddr=bf691000
>>>>           __ldl_mmu, vaddr=bf692000
>>>>           __ldl_mmu, vaddr=bf693000
>>>>           __ldl_mmu, vaddr=bf694000
>>>>           ...
>>>>           __ldl_mmu, vaddr=bf727000
>>>>           __ldl_mmu, vaddr=bf728000
>>>>           __ldl_mmu, vaddr=bfa89000
>>>>           __ldl_mmu, vaddr=bfa8a000
>>>> So the rest of the vaddrs I got differ by 4096 bytes instead of 4. I
>>>> repeated the experiment several times and got the same results. Is
>>>> there anything wrong, or could you explain this? Thanks.
>>>
>>> I see two possibilities here:
>>> - maybe there are more fast-path shortcuts in the QEMU code?
>>>   in that case the output of qemu -d op,out_asm would help
>>>   (example below).
>>> - maybe your compiler has optimized that sample code?
>>>   could you try declaring the array in your sample as 'volatile int'?
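>>>
>>> For instance, something along these lines (a sketch only; the binary
>>> name and disk options depend on your setup, and in this QEMU version
>>> the log ends up in /tmp/qemu.log by default):
>>>
>>>       qemu-system-i386 -d op,out_asm -hda guest.img
>>>       less /tmp/qemu.log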
>> After adding the "volatile" qualifier, the results are correct now.
>> So your patch can trap all of the guest's data load accesses, whether
>> they take the slow path or the fast path.
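>>
>> For reference, the volatile version of the test looks roughly like the
>> sketch below (only the array size and the loop come from the original
>> test; the initialization and the main() wrapper are reconstructed):
>>
>>           #define N 1048576
>>
>>           int main(void)
>>           {
>>               /* volatile keeps the compiler from eliding the loads */
>>               volatile int array[N];
>>               int temp, i;
>>
>>               /* initialize the array */
>>               for (i = 0; i < N; i++)
>>                   array[i] = i;
>>
>>               /* region of interest: one 4-byte load per element */
>>               for (i = 0; i < N; i++)
>>                   temp = array[i];
>>
>>               (void)temp;
>>               return 0;
>>           }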
>>
>> However, I ran into a problem when trying to understand the instruction
>> accesses. So I ran the VM with "-d in_asm" to see the program counter
>> of each piece of guest code. I got:
>>
>> __ldl_cmmu,ffffffff8102ff91
>> __ldl_cmmu,ffffffff8102ff9a
>> ----------------
>> IN:
>> 0xffffffff8102ff8a:  mov    0x8(%rbx),%rax
>> 0xffffffff8102ff8e:  add    0x790(%rbx),%rax
>> 0xffffffff8102ff95:  xor    %edx,%edx
>> 0xffffffff8102ff97:  mov    0x858(%rbx),%rcx
>> 0xffffffff8102ff9e:  cmp    %rcx,%rax
>> 0xffffffff8102ffa1:  je     0xffffffff8102ffb0
>> .....
>>
>> __ldl_cmmu,00000000004005a1
>> __ldl_cmmu,00000000004005a6
>> ----------------
>> IN:
>> 0x0000000000400594:  push   %rbp
>> 0x0000000000400595:  mov    %rsp,%rbp
>> 0x0000000000400598:  sub    $0x20,%rsp
>> 0x000000000040059c:  mov    %rdi,-0x18(%rbp)
>> 0x00000000004005a0:  mov    $0x1,%edi
>> 0x00000000004005a5:  callq  0x4004a0
>>
>> From the results, I see that the guest virtual address reported by
>> __ldl_cmmu is slightly different from the TB's pc (below "IN:").
>> Could you help me understand this? Which one is the true pc of the
>> memory access? Thanks.
>
> Guest code is accessed at translation time by C functions, and I guess
> there are other layers of address translation caching. I wouldn't try
> to interpret these _cmmu printouts and would instead instrument the
> [cpu_]ld{{u,s}{b,w},l,q}_code macros.
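>
> A hypothetical helper for that kind of instrumentation could look like
> the sketch below; where it has to be called from depends on the QEMU
> version and on where the ld*_code accessors are generated, so treat it
> as an illustration rather than a patch:
>
>     /* to be called from the [cpu_]ld*_code paths; logs the guest
>      * virtual address of each code fetch that goes through them */
>     static inline void log_code_fetch(const char *func, target_ulong vaddr)
>     {
>         fprintf(stderr, "%s: code fetch at vaddr=" TARGET_FMT_lx "\n",
>                 func, vaddr);
>     }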
Yes, you are right.
Some ldub_code fetches in the x86 guest do not go through __ldq_cmmu
when the TLB hits.
By the way, when I use your patch, I see too many log events for kernel
data accesses through the _mmu helpers, i.e., addresses around
0x7fffffffffff. There are so many such _mmu events that the user-mode
program cannot even finish. So I had to set up a condition like
     if (addr < 0x800000000000ULL)
            fprintf(stderr, "%s: " TARGET_FMT_lx "\n", __func__, addr);
Then my simple array-access program can run to completion.
I am wondering whether you have run into a similar problem, or whether
you have any suggestions on this.
My final goal is to obtain the memory access trace of a particular
process in the guest, so your patch really helps, except for the flood
of kernel _mmu events.
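For the per-process part I am considering narrowing the condition to a
single address space, roughly like the sketch below (the cr3 value of
the traced process would have to be obtained separately, and the field
names may differ between QEMU versions, so this is only an idea, not
tested code):

     /* log only user-space loads issued by the process whose page-table
      * base (CR3) matches traced_cr3 */
     static inline void log_user_load(CPUX86State *env, const char *func,
                                      target_ulong vaddr,
                                      target_ulong traced_cr3)
     {
         if (vaddr < 0x800000000000ULL && env->cr[3] == traced_cr3) {
             fprintf(stderr, "%s: " TARGET_FMT_lx "\n", func, vaddr);
         }
     }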

steven
>
> --
> Thanks.
> -- Max


