New rtag-created branch: first client action causes 'protocol error'

From: Vanderputten_Jennifer
Subject: New rtag-created branch: first client action causes 'protocol error'
Date: Wed, 15 Aug 2001 14:40:56 -0400

I posted the following to fa.info-cvs, and I wanted to send this to you as a
cvs bug submission.


I have some information on this problem that may help to locate the root
cause.  I have recently done load testing of branch creation (rtag) and
usage of the resulting new branches.  The load test created 500 new branches
from a separate base branch.  It then attempted to check out, one at a time,
each of the new branches, then add whitespace and check in.  It does this
in order, i.e., branch1 then branch2 and so on.  I noticed that the very first
checkout after a new branching (rtag command) ALWAYS got the protocol error.
Any checkouts after that, though, have worked without a hitch; also, if I did a
checkout once, then did a sleep of a few seconds, then attempted again, the
second was always successful.  This strongly suggests a timing issue.
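
As a rough illustration of the "try, sleep, try again" workaround described
above, here is a minimal C sketch.  It is not CVS code: the helper names
command_succeeds and run_with_retry are hypothetical, and it only papers over
the symptom rather than fixing the race.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* Run cmd through the shell; return 1 if it exited with status 0. */
static int command_succeeds(const char *cmd)
{
    int status = system(cmd);
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Hypothetical helper: try cmd up to `attempts` times, sleeping
 * delay_sec seconds after each failed try.  Returns the 1-based
 * number of the attempt that succeeded, or 0 if none did.
 */
static int run_with_retry(const char *cmd, int attempts, unsigned delay_sec)
{
    assert(cmd != NULL);
    for (int i = 1; i <= attempts; i++) {
        if (command_succeeds(cmd))
            return i;
        if (i < attempts)
            sleep(delay_sec);   /* give the server a moment before retrying */
    }
    return 0;
}
```

A load-test driver could call something like
run_with_retry("cvs checkout -r branch1 mymodule", 2, 3), where the module
name is a placeholder.  A return value of 2 would match the behaviour
reported here: first attempt fails, second succeeds.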

Not knowing much of the details of CVS server internals, and not having
looked thoroughly at that code yet, I am taking an educated guess that there
is some sort of one-time data creation performed upon the very first command
against a new branch.  I am further deducing that this is done in a separate
child process which is not finishing with that task before the child
associated with the client's request is finished.  This results in a lack of
needed data by the client's child at the moment that it expects to find that
clump of data, hence the protocol error.  This is why waiting a moment after
this first try, then trying again, results in success; the first child doing
the special one-time processing is given time to finish.  I say one-time
processing because this never happens again to the same branch.

I have taken a look at the cvs codebase for 1.11.1p1 and found that a
change made since the last revision is convincingly tied into this problem.  A
particular chunk of code was removed from a loop that forced it to wait on a
relevant event -- this was done, apparently, to fix a problem of
waits in particular cases (or possibly an infinite loop?).  However, it may
have actually opened the door for this problem -- and indeed, this problem may
be a 'particular' case, as mentioned in previous postings, with very large
code bases/modules.  That would certainly explain why some child process is
taking unusually long to finish generating some needed data.
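
To make the "wait on a relevant event" idea concrete, here is a small
self-contained sketch (again, not CVS code) of a parent that keeps reading a
pipe until EOF, so that a slow child cannot cause data to be missed.  The
names drain_until_eof and slow_child_demo, the message text, and the 0.1
second delay are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Read from fd until EOF (all writers closed) or until out is full. */
static ssize_t drain_until_eof(int fd, char *out, size_t cap)
{
    size_t total = 0;
    assert(out != NULL);
    while (total < cap) {
        ssize_t n = read(fd, out + total, cap - total);
        if (n == 0)
            break;                  /* EOF: the child has closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/* Hypothetical demo: the child writes late, the parent still gets it all. */
static ssize_t slow_child_demo(char *out, size_t cap)
{
    static const char msg[] = "branch data ready\n";
    int fds[2];
    pid_t pid;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        struct timespec delay = { 0, 100000000L };  /* simulate slow work */
        close(fds[0]);
        nanosleep(&delay, NULL);
        if (write(fds[1], msg, sizeof msg - 1) < 0)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);  /* parent must drop its write end or EOF never arrives */
    n = drain_until_eof(fds[0], out, cap);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

The point of the sketch is only that a reader which blocks until EOF does not
depend on how long the child takes; a reader that gives up early would see an
empty or partial buffer whenever the child is slow.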

The code to which I refer is to be found in server.c.  It was removed from the
main loop (I have this as being lines 2909 - 3130 in function

        while (stdout_pipe[0] >= 0
               || stderr_pipe[0] >= 0
               || protocol_pipe[0] >= 0
               || count_needed <= 0)

Here is the code chunk that was removed (I have this as lines 3132 - 3149):

        /*
         * OK, we've gotten EOF on all the pipes.  If there is
         * anything left on stdoutbuf or stderrbuf (this could only
         * happen if there was no trailing newline), send it over.
         */
        if (! buf_empty_p (stdoutbuf))
        {
            buf_append_char (stdoutbuf, '\n');
            buf_copy_lines (buf_to_net, stdoutbuf, 'M');
        }
        if (! buf_empty_p (stderrbuf))
        {
            buf_append_char (stderrbuf, '\n');
            buf_copy_lines (buf_to_net, stderrbuf, 'E');
        }
        if (! buf_empty_p (protocol_inbuf))
            buf_output0 (buf_to_net,
                         "E Protocol error: uncounted data discarded\n");
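
For readers without the CVS sources at hand, the following stand-alone sketch
mimics what the flush step above appears to do: terminate a leftover partial
line with a newline and forward it with a one-letter tag.  struct textbuf and
the functions below are hypothetical stand-ins, not the real CVS buffer API,
and the "tag, space, line" output format is an assumption about how
buf_copy_lines behaves.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a CVS buffer: a small NUL-terminated string. */
struct textbuf {
    char data[256];
};

static int textbuf_empty(const struct textbuf *b)
{
    return b->data[0] == '\0';
}

/* Append at most n bytes of s, truncating if the buffer is full. */
static void textbuf_append_n(struct textbuf *b, const char *s, size_t n)
{
    size_t used = strlen(b->data);
    size_t room = sizeof b->data - 1 - used;
    if (n > room)
        n = room;
    memcpy(b->data + used, s, n);
    b->data[used + n] = '\0';
}

/* Move each complete line from in to net, prefixed with "<tag> ". */
static void copy_lines(struct textbuf *net, struct textbuf *in, char tag)
{
    char *start = in->data;
    char *nl;
    while ((nl = strchr(start, '\n')) != NULL) {
        char prefix[2] = { tag, ' ' };
        textbuf_append_n(net, prefix, sizeof prefix);
        textbuf_append_n(net, start, (size_t)(nl - start) + 1);
        start = nl + 1;
    }
    /* Keep whatever incomplete tail remains at the front of in. */
    memmove(in->data, start, strlen(start) + 1);
}

/* Mirror of the removed chunk for one stream: newline-terminate, then send. */
static void flush_leftover(struct textbuf *net, struct textbuf *in, char tag)
{
    assert(net != NULL && in != NULL);
    if (!textbuf_empty(in)) {
        textbuf_append_n(in, "\n", 1);
        copy_lines(net, in, tag);
    }
}
```

With a stdout buffer holding "partial line" and an empty stderr buffer,
calling flush_leftover with 'M' and then 'E' leaves "M partial line\n" in the
outgoing buffer and nothing for stderr.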

I hope this can shed some light on the problem.  Timing issues are always
very difficult to pinpoint and fix.

--Jen (BIRT)
