bug-cvs

Re: cvs 1.11.5 pserver sig11 crash on FreeBSD 4.8-R


From: Scott Mitchell
Subject: Re: cvs 1.11.5 pserver sig11 crash on FreeBSD 4.8-R
Date: Wed, 13 Aug 2003 09:41:05 +0100
User-agent: Mutt/1.4.1i

On Tue, Jul 29, 2003 at 03:32:30AM -0700, Mark D. Baushke wrote:
> Scott Mitchell <address@hidden> writes:
> 
> > The two clients that I'm certain have triggered server-side core dumps
> > are 1.10 on a Win2K box and 1.11.1p1 on a Debian (2.4.18 kernel) box.
> > All I get is a core and a UID in the logs, so unless I actually see it
> > happen and grab the user right then to see what they were doing, I've no
> > way of knowing which client machine was the culprit :-(
> > 
> > Thanks for the feedback.  I'll see if I can get one or both of those newer
> > versions compiled and try them out next week.
> 
> Well, it sounds like a fairly nasty situation. I wish you well in
> tracking down the problem.

OK, since then I have tried two things:

1. Ran memtest86 over a weekend on the server.  It didn't find anything.
   Now I realise that not finding an error doesn't mean there isn't one,
   but this coupled with the fact that I can run multiple buildworlds on
   this machine without any problems gives me some confidence that this
   isn't a hardware issue.

2. Compiled a stock cvs 1.11.6 (no FreeBSD extensions) and ran with that
   for a few days.  Unfortunately it started crashing in the same way as
   1.11.5-FreeBSD, so I'm back with 1.11.1p1-FreeBSD again.

I caught two core dumps from the 1.11.6 server before I removed it.  In
both cases the client was 1.11.1p1 on a Debian box.  The stack traces are
pretty similar to what I was seeing before (see attached gdb.log).  One
thing I didn't notice before, but which also appears in the earlier dumps,
is that cvs seems to be catching a sig 13 (SIGPIPE), then segfaulting in
the cleanup that results from this.  I guess the SIGPIPE is caused by the
connection to the cvs client closing down, unexpectedly or otherwise.
There could be a race or other timing issue here, since this only happens
sometimes.  Of course this is all speculation from looking at the stack
trace; I don't know enough about the cvs code to say anything for sure.
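
To make the suspected failure mode concrete, here is a minimal sketch of the general pattern, not of the cvs code itself. It assumes a server that writes to a client which has already gone away, and a cleanup routine that can be reached from more than one path (the broken-pipe path and the normal exit path). The names Session, cleanup and serve_to_vanished_client are hypothetical and chosen only for illustration. The sketch is in Python rather than C, so the broken pipe shows up as a BrokenPipeError (EPIPE) instead of a signal handler being invoked.

```python
import os


class Session:
    """Hypothetical stand-in for per-connection server state."""

    def __init__(self):
        self.buffers = ["pending output"]
        self.cleanups = 0

    def cleanup(self):
        # Idempotent guard: a second call is a no-op rather than a
        # second teardown of state that has already been released.
        if self.buffers is None:
            return
        self.buffers = None
        self.cleanups += 1


def serve_to_vanished_client():
    """Write to a pipe whose reader has gone away, then clean up once."""
    session = Session()
    read_end, write_end = os.pipe()
    os.close(read_end)  # simulate the client closing the connection
    try:
        os.write(write_end, b"some response\n")
    except BrokenPipeError:
        # CPython ignores SIGPIPE by default, so the failed write is
        # reported here as EPIPE instead of interrupting the process.
        session.cleanup()
    finally:
        os.close(write_end)
        session.cleanup()  # the normal exit path reaches cleanup again
    return session.cleanups
```

Calling serve_to_vanished_client() reaches cleanup() twice, but the guard means the teardown only runs once. Without a guard like this, the second pass would operate on state that is already gone, which in a C program is one way to end up with a segfault during cleanup. Whether that is what happens inside cvs is still speculation.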

I don't really want to try running 1.12.1.1 on a production repository,
so I'm not sure what else to try on the software side.  We're going to
swap out the RAM on this server sometime in the next couple of weeks and
run memtest86, a few buildworlds and all the various versions of cvs I've
built again, to see if anything changes.

Anyone have any other suggestions?

        Scott

Attachment: gdb.log
Description: Text document

