Re: SIGSEGV problem
From: Jan-Henrik Haukeland
Subject: Re: SIGSEGV problem
Date: Thu, 14 Aug 2003 02:11:50 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Civil Service, linux)
I ran a quick test with efence and managed to reproduce the SIGSEGV
(there may be more than one). The SIGSEGV is raised in
process/common.c:connectchild() at this line:
parent->children[parent->children_num - 1] = (struct myprocesstree *) child;
From my gdb/efence session:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1024 (LWP 1269)]
0x0805b340 in connectchild (parent=0x41143fa0, child=0x41144740)
at process/common.c:232
(gdb) p *parent->children
Cannot access memory at address 0x41365fcc
(gdb) p parent->children[parent->children_num - 1]
Cannot access memory at address 0x41365ffc
I suspect it's caused by an access outside the bounds of the
array. Maybe Christian can debug this since it's his code :) I'm off to
bed, it's late.
BTW, Christian, why do you use casts in this code!? I'm thinking about
stuff like (struct myprocesstree ..). It is *not* necessary.
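To illustrate both points (staying inside the array and dropping the cast), here is a minimal sketch of how a child could be appended to a parent's child list. It is not the actual monit code: the struct layout and the name connectchild_sketch are assumptions for illustration, and only the field names children and children_num come from the line quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for monit's struct myprocesstree. */
struct myprocesstree {
    int pid;
    struct myprocesstree **children;  /* array of pointers to children */
    int children_num;                 /* number of valid entries */
};

/* Append child to parent's child list.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
static int connectchild_sketch(struct myprocesstree *parent,
                               struct myprocesstree *child)
{
    struct myprocesstree **tmp;

    if (parent == NULL || child == NULL)
        return -1;
    assert(parent->children_num >= 0);

    /* Grow the array by one slot BEFORE writing to the new index. */
    tmp = realloc(parent->children,
                  (size_t)(parent->children_num + 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;  /* the old array is still valid and untouched */

    parent->children = tmp;
    /* child already has the right pointer type, so no cast is needed. */
    parent->children[parent->children_num] = child;
    parent->children_num++;
    return 0;
}
```

Because the array is resized before the store and the counter is bumped only afterwards, the index written is always the last valid slot.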
Martin Pala <address@hidden> writes:
> Another SIGSEGV occurs if the monitored process has a pidfile, but no
> process with the given pid exists. It fails in the same thread as
> described below, but later. See another trace.
>
> Martin
>
> Martin Pala wrote:
>
>> Hi,
>>
>> during tests, occasionally one of monit's threads receives SIGSEGV. It
>> happens when the monitored process is not running.
>>
>> the path to the SIGSEGV is as follows:
>>
>> 1.) we're waiting for the process to start in the main thread after
>> spawning the process start method:
>> ...
>> thread is status= pthread_create(&thread, NULL, wait_start, s);
>> ...
>>
>> 2.) in the new thread (in wait_start) we detach and wait for the
>> process to start or for the timeout:
>> ...
>> if(is_process_running(s))
>> break;
>> ...
>>
>> 3.) the thread crashed right after the first call of
>> is_process_running(s) -> get_pid(s->path) -> exist_file(char *file)
>> -> stat(file, &buf) -
>> see strace output for complete trace:
>> ...
>> 3788 stat64("XE^^G^H/run/slapd.pid", <unfinished ...>
>> ...
>>
>>
>> As you can see, it seems that the pidfile path pointer points to a
>> strange place.
>>
>> This problem happens only occasionally (circa 5% of test attempts
>> failed with this error; the others were OK).
>>
>> I tried to include some debug tags to trace it, something like
>> fprintf(stderr, "mark1"); etc., but as soon as I did, I was not
>> able to replicate the problem at all. I tried it many times again
>> with and without these tags and the result was the same: with tags
>> it worked well, without tags it failed => probably there is some
>> race condition, maybe outside monit (in libs).
>>
>> Any ideas?
>>
>> Martin
>>
>>
>>
>>
>>------------------------------------------------------------------------
>>
>>_______________________________________________
>>monit-dev mailing list
>>address@hidden
>>http://mail.nongnu.org/mailman/listinfo/monit-dev
>>
--
Jan-Henrik Haukeland