Re: [Pan-users] disappearing articles?


From: Duncan
Subject: Re: [Pan-users] disappearing articles?
Date: Sun, 25 May 2003 04:03:19 -0700
User-agent: KMail/1.5.1

On Fri 23 May 2003 18:24, Chris Petersen posted as excerpted below:
> I finally bit the bullet and subscribed to a pay-for usenet server,
> and thus have the bandwidth to play in some of the binary groups.
>
> In doing so, I've noticed that the article count gets weird sometimes -
> I'll download a bunch of headers (say, 50k), but when all is done, only
> 41k will show up.  I've tried my best to turn off all filters/rules, so
> nothing should be getting excluded - but I fear that it is.  Is this
> just Pan correcting for some server stuff, or is it filtering stuff when
> I tell it not to (since I've had plenty of times where I tell it to
> filter stuff out and it doesn't)?
>
> oh, using .14 in linux.

It's possible you've hit on a bug, but I wouldn't jump to that conclusion just 
yet, as there are times when that would be the expected behavior.

When getting the initial article count estimates, most readers, including Pan,
simply use the article sequence numbers for that group on that server to
come up with the estimate.  If the previous highest number you'd processed
was 18277397, and the new high number is 18278399, then it will report 1002
new articles in the group.  That figure is only right if every number in
between is actually present.  On some servers, for whatever reason, articles
end up on the reading server out of order, with gaps in the numbering.
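
To make the arithmetic concrete, here is a minimal sketch in Python.  It is
not Pan's actual code; the function names and the 200-article gap are made up
for illustration, and only the two sequence numbers come from the example
above.

```python
def estimated_new_articles(prev_high: int, new_high: int) -> int:
    """Estimate new articles purely from sequence numbers (hypothetical helper)."""
    return max(0, new_high - prev_high)


def actual_new_articles(prev_high: int, present_numbers) -> int:
    """Count the articles above prev_high that the server really holds."""
    return sum(1 for n in present_numbers if n > prev_high)


prev_high = 18277397
new_high = 18278399

# Assumption for illustration: numbers 18277500..18277699 never arrived
# on this server, leaving a 200-article hole in the sequence.
missing = set(range(18277500, 18277700))
present = [n for n in range(prev_high + 1, new_high + 1) if n not in missing]

estimate = estimated_new_articles(prev_high, new_high)
actual = actual_new_articles(prev_high, present)
print(estimate, actual)  # 1002 802
```

The estimate needs only two numbers, which is why it is cheap to show up
front, but it can only ever overcount relative to what is really there.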

One reason this might occur is that articles might be centrally numbered, then
distributed to several servers.  The reason for this would be to keep all
those servers in numerical sync, so one could switch between them without
messing up the read-article tracking and so on.  Most readers track
downloaded and read articles by those numbers rather than by the more
accurate message-ID.  The message-ID is supposed to be globally unique, while
the server's group sequence numbers are normally unique only within that
group on that server.  My ISP, Cox, does this, having a central feed
processing location that does the numbering, and three servers, east,
central, and west, that are numerically synced.  Guess what happens when one
gets behind?  Right, it gets some of the articles but not others, then when
the problem is corrected, the missing ones show up.  That creates gaps in the
sequence numbers, making those estimates inaccurate.

Another reason the estimate may be wrong is filtering and cancels, the
latter only if the server processes cancels, of course.  The numbering could
be done before or after such filtering, but since cancels typically arrive
somewhat later than the articles they cancel, they would always create holes
in numbering that has already been assigned.
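
As a rough illustration of how cancels leave holes, the Python sketch below
numbers ten articles on arrival, removes three of them afterwards, and lists
the resulting gaps.  The helper name and the tiny article numbers are
hypothetical, chosen only to keep the example readable.

```python
def missing_ranges(present, low, high):
    """Return inclusive (start, end) ranges in [low, high] absent from present."""
    have = set(present)
    gaps = []
    start = None
    for n in range(low, high + 1):
        if n not in have:
            if start is None:
                start = n          # a new hole begins here
        elif start is not None:
            gaps.append((start, n - 1))  # the hole ended just before n
            start = None
    if start is not None:
        gaps.append((start, high))       # hole runs to the end of the range
    return gaps


# Assumption: the server numbered articles 1..10 as they arrived,
# then later processed cancels for articles 4, 5 and 9.
numbered = list(range(1, 11))
cancelled = {4, 5, 9}
remaining = [n for n in numbered if n not in cancelled]

print(missing_ranges(remaining, 1, 10))  # [(4, 5), (9, 9)]
```

A reader going by the high and low numbers alone would expect ten articles
here, while only seven can actually be fetched.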

There are probably other reasons the numbering may be off as well, but this
should be enough to demonstrate why such initial numbers are only educated
guesstimates, and shouldn't be taken as more than that.

-- 
Duncan - List replies preferred.
"They that can give up essential liberty to obtain a little
temporary safety, deserve neither liberty nor safety." --
Benjamin Franklin




