Re: [Pan-users] Anyone know why pan occasionaly decides it needs to down

From:	Duncan
Subject:	Re: [Pan-users] Anyone know why pan occasionaly decides it needs to download all headers in a group?
Date:	Tue, 19 Jun 2012 07:16:05 +0000 (UTC)
User-agent:	Pan/0.138 (Der Geraet; GIT f50ed2b /usr/src/portage/src/egit-src/pan2)

Jim Henderson posted on Thu, 01 Mar 2012 21:32:44 +0000 as excerpted:

> I've got a handful of servers I subscribe to groups from.  About once
> every day or two, with random groups (which one varies, and all of them
> have been hit from one time to another), Pan will decide that instead of
> using the headers in its cache, it's going to pull all the headers for
> that group.
> 
> I observed this behaviour before I started using multiple machines, but
> it wasn't a huge issue - but now I sync the .pan2 and News directories
> to other machines using Dropbox, so having Pan decide to download all
> headers means that the group's file in the .pan2/articles directory
> grows to a fairly large size - which affects how much space is used on
> my Dropbox account and how long it takes to sync the data.
> 
> I've been working around the issue by killing the task, deleting the
> group's headers, and telling Pan to download 14 days' worth of headers.
> But it's somewhat annoying to have to do that, so I thought I'd ask.
> 
> I'm running Pan 0.135 on openSUSE 12.1.  I also observed this behaviour
> on earlier 0.13x releases on earlier releases of openSUSE (back to at
> least 11.2).

I had this marked for later reply... but didn't think it'd be /this/ much 
later!  However, despite the time, hitting view-all (including read), I 
see no other responses.  Maybe this will bring some.

I don't believe I've ever seen this sort of behavior (since before pan 
0.90) and I'm trying to think what might cause it.  

The way this normally works, with most news clients including pan, is 
that a server sequentially numbers the posts in each group.  When the 
client updates a group, the server replies with three numbers: the low 
water mark (the oldest message still around, lowest number), the high 
water mark (the newest, highest-numbered message), and an ESTIMATE of 
the number of messages available.

The client then compares its old high water mark with the new one.  
Assuming the server's low water mark is at or below the client's old 
high water mark (no gap), the client requests all message headers (as 
overviews) between the old high water mark and the new one.  If the user 
hasn't checked in for a while and the server has expired some messages 
between the old high water mark and its new low water mark, the client 
knows to only get the messages between the current low and high water 
marks.
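
To make the comparison above concrete, here's a minimal sketch in 
Python.  This is NOT pan's actual code; the function name and arguments 
are made up for illustration, and it only models the water mark logic as 
I described it.

```python
def headers_to_request(client_high, server_low, server_high):
    """Return the (first, last) article numbers to fetch, or None.

    client_high: highest article number the client has already seen.
    server_low / server_high: water marks the server just reported.
    All names here are illustrative, not taken from pan's source.
    """
    if server_high <= client_high:
        # The server reports nothing beyond what the client has seen.
        return None
    # Start just past what we have, unless the server has already
    # expired those articles, in which case start at its low water mark.
    first = max(client_high + 1, server_low)
    return (first, server_high)
```

With an old high water mark of 100 and a server reporting low 50, high 
120, that asks for 101 through 120.  If the server has expired up to 150 
in the meantime and now reports low 150, high 200, it asks for 150 
through 200 instead.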


OK, that's the general process.

It's also worth knowing that pan tracks the message sequence numbers it 
has seen in the newsrc files, one for each server, since the newsrc 
format assumed only a single server.  If you look in this file, you'll 
see the ranges of numbers that pan says it has already seen, so doesn't 
need to request again, for each group you've ever downloaded headers for.
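
If you want to inspect those ranges without eyeballing the file, 
something like the following sketch would do.  It assumes the classic 
newsrc line layout ("group.name: 1-100,105", with ":" for subscribed and 
"!" for unsubscribed groups); I haven't checked it against every file 
pan writes, so treat it as a starting point only.

```python
def parse_newsrc_line(line):
    """Split one newsrc line into (group, subscribed, ranges).

    ranges is a list of (low, high) tuples; a single number such as
    '105' becomes (105, 105).  Assumes the classic newsrc layout.
    """
    line = line.strip()
    # Find the first ':' (subscribed) or '!' (unsubscribed) separator.
    for i, ch in enumerate(line):
        if ch in ":!":
            break
    else:
        raise ValueError("no ':' or '!' separator in line: %r" % line)
    group = line[:i]
    subscribed = line[i] == ":"
    ranges = []
    for part in line[i + 1:].split(","):
        part = part.strip()
        if not part:
            continue  # group with no articles marked as seen yet
        low, _, high = part.partition("-")
        ranges.append((int(low), int(high or low)))
    return group, subscribed, ranges
```

The highest number in the last range is effectively the client's old 
high water mark for that group on that server.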


OK, that's where pan keeps track of those numbers.

What would make pan try to redownload all headers, instead of just the 
ones since it last downloaded?

Somehow, that tracking is getting out of sync.


The most common way it gets out of sync is if you transfer servers and 
try to use the same sequence numbering, since each server will naturally 
have its own sequencing for a particular group.  This can also happen if 
a server's numbering gets "reset" for some reason or other, maybe because 
it lost its own records and got rebuilt.

If the new server's numbers are LOWER than the ones the client was 
tracking, then the client believes it has already seen everything, so 
won't pull new headers.  That's when you force it to pull all headers, or 
you set up a new server entry instead of trying to reuse the old one, or 
you either delete the newsrc or edit it to remove its tracking for just 
that group.
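
For the "edit it" option, a sketch like this would drop just one 
group's line from the text of a newsrc.  Again it assumes the classic 
one-group-per-line layout, and the function name is made up.  Do it with 
pan closed and a backup of the file in hand.

```python
def drop_group(newsrc_text, group):
    """Return newsrc_text without the line tracking the given group.

    Assumes one group per line, with the group name ending at the
    first ':' or '!'.  Other lines are returned untouched.
    """
    kept = []
    for line in newsrc_text.splitlines(keepends=True):
        # Everything before the first ':' or '!' is the group name.
        name = line.split(":", 1)[0].split("!", 1)[0].strip()
        if name != group:
            kept.append(line)
    return "".join(kept)
```

The next time the group is updated, the client has no old high water 
mark for it and starts over from whatever the server currently offers.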

If the new server's numbers are HIGHER, then obviously, the client will 
think all the messages are new, and try to download all the headers 
available, back either to the client's old high water mark, or the 
server's current low water mark, whichever is fewer messages.


You mention that you're syncing between different machines.  If you sync 
newsrcs, but NOT servers.xml, AND you use the default newsrc numbering 
(that is, you didn't rename the newsrcs, both in servers.xml and on 
disk, to reflect the server name instead of an arbitrary number, as I 
did here), it could be that the servers were added in a different order 
on one of the machines.  I don't recall exactly what the default newsrc 
names are, but it's something like newsrc-1, newsrc-2, etc.  Now if 
newsrc-1 corresponds to server A on machine 1, but server A has newsrc-2 
on machine 2, then syncing them WITHOUT also syncing the servers.xml 
file that maps one to the other will screw you up, because pan will now 
believe the per-server, per-group sequential message numbering belonging 
to one server/group actually belongs to a different one!
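
One way to check for that is to write down, for each machine, which 
newsrc file each server uses, and compare.  The sketch below takes those 
mappings as plain dicts; how you get them out of servers.xml is left to 
you, since I'm not going to guess at the file's exact layout here.  The 
names are hypothetical.

```python
def mismatched_newsrcs(machine_a, machine_b):
    """Compare two {server name: newsrc filename} mappings.

    Returns {server: (file on A, file on B)} for every server that
    both machines know about but map to different newsrc files.
    """
    problems = {}
    for server in sorted(machine_a.keys() & machine_b.keys()):
        if machine_a[server] != machine_b[server]:
            problems[server] = (machine_a[server], machine_b[server])
    return problems
```

An empty result means the two machines agree, so syncing the newsrcs 
alone shouldn't cross the numbering between servers.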


That's one possibility.


Another would be that something's corrupting or deleting the newsrcs on 
one or more of your machines, such that pan's losing track of which 
messages it has actually seen, and thus when told to download all new, 
downloads all of them.


Yet another possibility, probably remote, is that one or more of your 
servers is having bad enough problems to lose its own message sequencing 
on a group once in a while.

A variant of that one would be if the server is actually a server farm, 
probably served round-robin, and one of say five or ten servers is bad, 
out of sync with the rest.  Every time you happen to get the bad 
server...  (This one's actually reasonable and I've known it to happen 
with Highwinds based servers, in particular.  One bad front-end in their 
round-robin balancing can be a real frustration!  What can make it worse 
is that nobody likes the bad server, so it always looks as if it has less 
load than the others, and if they're using load stats rather than pure 
round-robin, the odds of getting the bad front-end thus go up 
dramatically!)
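
If you suspect a bad front-end, one rough test is to note the low and 
high water marks the server reports for a group each time you connect, 
then look for a high water mark that goes backwards.  The sketch below 
only does the comparison; collecting the samples is up to you, and the 
function name is made up.  It also assumes the first sample came from a 
good front-end, which may not be true.

```python
def find_regressions(samples):
    """Return indexes of samples whose high water mark went backwards.

    samples: list of (low, high) tuples in the order observed.
    A healthy, consistently numbered server should never report a
    high water mark below one it reported earlier.
    """
    suspects = []
    max_high = None
    for i, (low, high) in enumerate(samples):
        if max_high is not None and high < max_high:
            suspects.append(i)  # lower than something seen before
        else:
            max_high = high
    return suspects
```

If the suspect samples line up with the times pan decided to refetch 
everything, that points at the server side rather than at your sync 
setup.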

Yet another variant would be "cheap" providers, like the old newsfeeds 
used to be.  They had lots of servers in various places, but they weren't 
coordinated or number-synchronized, so if you used one one day and 
another a different day, you pretty much HAD to track by message-ID, not 
by message sequence numbering AT ALL.


Anyway, maybe you've fixed it by now, maybe not, but those are the 
triggers I came up with in the few minutes I was thinking about it 
while writing this post.  If it's not fixed by now, see if any of those 
match what you are seeing...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
