Re: Problem of GDB interaction with interrupted system calls

bug-gdb

From:	Alexandre Rusev
Subject:	Re: Problem of GDB interaction with interrupted system calls
Date:	Fri, 13 Nov 2009 17:42:48 +0300
User-agent:	Thunderbird 1.5.0.8 (X11/20060911)

teawater wrote:

I think this is a behavior of the kernel.  I think changing the pc is always a
dangerous thing.  :)

Yes, extremely dangerous! ;)
But GDB supports features such as "call <func_name>", and when using it the average user does not even care about the PC,
he just thinks that he is making a call to some function...

Moreover, the kernel's intent in changing the stack is to make the system call restart.
The kernel could (in theory) choose not to return to userland at this point (because the process has set no signal handlers)
and restart the syscall internally.
If it were so, the whole process could have been transparent to GDB.

And this use case is encountered quite often by users... :(

Because the nature of the problem is quite clear, it could (once again, "in theory") be worked around both in the kernel and in GDB.

Because GDB does a lot of tricks and serves technical purposes, maybe it is the best place to implement the workaround?
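
A GDB-side workaround could look roughly like the sketch below. It is only a model: the register structure, its field names, the restart marker value and the 2-byte rewind are all assumptions made for illustration, not GDB or kernel code. The idea is that a debugger-initiated PC write also marks the task as "not in a system call", so the kernel no longer rewinds the PC on resume.

```c
#include <stdint.h>

/* Hypothetical register snapshot as seen by a tracer.  Field names and
 * the restart rule below are assumptions for illustration only. */
struct regs_model {
    int32_t  orig_syscall;  /* syscall number, or -1 for "not in a syscall" */
    int32_t  retval;        /* syscall return value register */
    uint32_t pc;            /* program counter */
};

#define MODEL_RESTART_CODE (-512)  /* assumed "restart me" marker */
#define MODEL_REWIND       2u      /* assumed syscall instruction size */

/* Model of the kernel resume path: rewind only for a pending restart. */
static void model_kernel_resume(struct regs_model *r)
{
    if (r->orig_syscall >= 0 && r->retval == MODEL_RESTART_CODE)
        r->pc -= MODEL_REWIND;
}

/* Debugger-side PC write that also cancels the pending restart. */
static void model_debugger_write_pc(struct regs_model *r, uint32_t new_pc)
{
    r->pc = new_pc;
    r->orig_syscall = -1;
}
```

With this model, a plain resume from 0xb7fe3410 still rewinds to 0xb7fe340e, while a resume after model_debugger_write_pc() continues exactly at the address the user asked for. The cost is that the interrupted system call is not restarted.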

Yet from the kernel's point of view, that's the place where nobody (not even GDB) is supposed to interfere with the kernel's internal
housekeeping, at least until the next machine instruction is executed.
So the kernel could either:
[1] not allow GDB/ptrace to stop the process and change user registers at that point
[2] remember the state of essential registers (PC, maybe others like FP) and revert all changes before executing the next instruction
[3] remember the state of essential registers (PC, maybe others like FP) and, if the process was stopped
and somebody (GDB/ptrace) changed the PC before execution of the next instruction, avoid making its own change to the PC
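
Option [3] can be sketched as a small model. Everything here is hypothetical (the structure, the function names, the idea of recording the PC at stop time); it only illustrates the decision "rewind unless somebody changed the PC while the process was stopped" and is not real kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state; not a real kernel structure. */
struct task_model {
    uint32_t pc;              /* user-visible program counter */
    uint32_t pc_at_stop;      /* PC recorded when the task was stopped */
    bool     restart_pending; /* an interrupted syscall wants a restart */
    uint32_t rewind;          /* size of the syscall instruction */
};

/* Called when the task stops and control goes to the tracer. */
static void model_stop(struct task_model *t)
{
    t->pc_at_stop = t->pc;
}

/* Called when the task resumes: skip the rewind if the PC was changed. */
static void model_resume(struct task_model *t)
{
    if (t->restart_pending && t->pc == t->pc_at_stop)
        t->pc -= t->rewind;
    t->restart_pending = false;
}
```

One limitation of this sketch: if the tracer writes back the same PC value it read, the change cannot be detected by comparing values alone, so a real implementation would probably need an explicit "PC was written" flag.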

Anyway, the problem exists and I'm trying at least to find out where it needs to be fixed.


infrun: clear_proceed_status_thread (process 4542)
infrun: proceed (addr=0xffffffff, signal=144, step=0)
infrun: resume (step=0, signal=0), trap_expected=0
infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
infrun: target_wait (-1, status) =
infrun:   4542 [process 4542],
infrun:   status->kind = stopped, signal = SIGINT
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0xb7fe3410
infrun: random signal 2

Program received signal SIGINT, Interrupt.
infrun: stop_stepping
0xb7fe3410 in __kernel_vsyscall ()
(gdb) disas
Dump of assembler code for function __kernel_vsyscall:
   0xb7fe3400 <+0>:	push   %ecx
   0xb7fe3401 <+1>:	push   %edx
   0xb7fe3402 <+2>:	push   %ebp
   0xb7fe3403 <+3>:	mov    %esp,%ebp
   0xb7fe3405 <+5>:	sysenter
   0xb7fe3407 <+7>:	nop
   0xb7fe3408 <+8>:	nop
   0xb7fe3409 <+9>:	nop
   0xb7fe340a <+10>:	nop
   0xb7fe340b <+11>:	nop
   0xb7fe340c <+12>:	nop
   0xb7fe340d <+13>:	nop
   0xb7fe340e <+14>:	jmp    0xb7fe3403 <__kernel_vsyscall+3>
=> 0xb7fe3410 <+16>:	pop    %ebp
   0xb7fe3411 <+17>:	pop    %edx
   0xb7fe3412 <+18>:	pop    %ecx
   0xb7fe3413 <+19>:	ret
End of assembler dump.
(gdb) p $pc
$1 = (void (*)()) 0xb7fe3410 <__kernel_vsyscall+16>
(gdb) p $pc=0xb7fe3413
$2 = (void (*)()) 0xb7fe3413 <__kernel_vsyscall+19>
(gdb) si
infrun: clear_proceed_status_thread (process 4542)
infrun: proceed (addr=0xffffffff, signal=144, step=1)
infrun: resume (step=1, signal=0), trap_expected=0
infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
infrun: target_wait (-1, status) =
infrun:   4542 [process 4542],
infrun:   status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0xb7fe3412
infrun: stepi/nexti
infrun: stop_stepping
0xb7fe3412 in __kernel_vsyscall ()

Thanks,
Hui



On Nov 2, 7:27 pm, Alexandre Rusev <address@hidden> wrote:

teawater wrote:

This Ctrl-C signal will not really be sent to the inferior.

But the result is an interrupted system call, which is then restarted by the kernel.
And if the user changes the program counter in GDB at this point,
the change takes place before the kernel makes its own modification of the PC.
The result is that execution jumps neither to the point the user specified in GDB
nor to the point the kernel intended in order to restart the syscall.

Is it incorrect behavior of GDB, incorrect behavior of the kernel, or
something else?

(gdb) help info handle

On Oct 31, 12:10 am, Alexandre Rusev <address@hidden> wrote:

Hi.

When the program at the end of this message is debugged under GDB and stopped with
Ctrl+C, it is usually found in an interrupted system call. (The same result is
observed for x86 and PPC with kernels 2.6.18 and 2.6.28.)

(gdb) where
#0  0xb7fe2424 in __kernel_vsyscall ()
#1  0xb7f36ad0 in nanosleep () from /lib/libc.so.6
#2  0xb7f3690e in sleep () from /lib/libc.so.6
#3  0x08048600 in qqq () at testBT2.c:45
#4  0x080487a5 in eee () at testBT2.c:73
#5  0x0804846a in main () at testBT2.c:17

The PC is pointing at the next instruction, according to GDB.
But the kernel tries to restart the syscall by changing the PC to PC-4
(in the case of PowerPC, and to some other value for x86),
and it makes its change to the PC after the user continues execution of
the program in GDB with the "cont" or "si" command.

The issue is that if the user changes the PC at this point, or uses the "call
<func_name>" GDB command, both changes to the PC
are added together (as the kernel uses a PC-relative change, i.e. PC - 4), and the
program continues execution at an absolutely wrong place.
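
The arithmetic can be modeled in a few lines. This is only a sketch: pc_after_restart() is a hypothetical helper, and the 2-byte rewind used for x86 is an assumption (the size of the syscall instruction), chosen because it is consistent with the x86 trace shown earlier in this thread.

```c
#include <stdint.h>

/* Hypothetical model: on resume the kernel rewinds the PC by the size of
 * the syscall instruction so that the instruction is executed again.
 * The rewind is relative, so it is applied to whatever PC it finds. */
static uint32_t pc_after_restart(uint32_t pc_at_resume, uint32_t rewind)
{
    return pc_at_resume - rewind;
}
```

With a rewind of 2, an untouched PC of 0xb7fe3410 becomes 0xb7fe340e, the jmp that leads back to sysenter. If the user first sets the PC to 0xb7fe3413, the same rewind gives 0xb7fe3411, and a single "si" over the one-byte pop at that address would stop at 0xb7fe3412, which matches the stop_pc in the trace.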

The issue can be easily observed if a breakpoint is set just before
<func_name> and the PC is then changed to the <func_name> address.
In that case the breakpoint is hit although it should not be.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

void qqq(void);
void www(void);
void eee(void);

void *xrealloc(void *, int);

int main(void)
{
    eee();
    return EXIT_SUCCESS;
}

/* Allocate, resize and free blocks, sleeping in between so that Ctrl+C
   usually lands inside the sleep() system call. */
void qqq(void) {
    void *a[256];
    size_t i, n;

    for (i = 0; i < 256; i++)
    {
        n = (size_t) ((rand() * 256.0) / (RAND_MAX + 1.0)) + 1;
        a[i] = malloc(n);
    }
    for (i = 256; i > 0; i--)
    {
        n = (size_t) ((rand() * 256.0) / (RAND_MAX + 1.0)) + 1;
        a[i - 1] = xrealloc(a[i - 1], n);
    }
    sleep(1);
    for (i = 0; i < 256; i += 2)
        free(a[i]);
    for (i = 256; i > 0; i -= 2)
        free(a[i - 1]);
    sleep(1);
}

/* Same as qqq(), but calls realloc() directly instead of xrealloc(). */
void www(void) {
    void *a[256];
    size_t i, n;

    for (i = 0; i < 256; i++)
    {
        n = (size_t) ((rand() * 256.0) / (RAND_MAX + 1.0)) + 1;
        a[i] = malloc(n);
    }
    for (i = 256; i > 0; i--)
    {
        n = (size_t) ((rand() * 256.0) / (RAND_MAX + 1.0)) + 1;
        a[i - 1] = realloc(a[i - 1], n);
    }
    sleep(1);
    for (i = 0; i < 256; i += 2)
        free(a[i]);
    for (i = 256; i > 0; i -= 2)
        free(a[i - 1]);
    sleep(1);
}

/* Loop forever so the program can be interrupted at any time. */
void eee(void) {
    while (1) {
        qqq();
        www();
    }
}

void *xrealloc(void *addr, int n) {
    return realloc(addr, n);
}

_______________________________________________
bug-gdb mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/bug-gdb
