From: Charles Kerr
Subject: [Pan-devel] Re: [Pan-users] Making Multiple Servers Happen In Your Lifetime
Date: Mon, 7 Oct 2002 11:46:26 -0700
User-agent: Mutt/1.3.20i

[Setting followups to pan-devel]

On Mon, Oct 07, 2002 at 03:06:23AM -0700, Duncan wrote:
> As for what it can do now... On the face of it, the servers are separate.  
> However, I was thinking about that in regards to it's cache handling the 
> other day and wondering...  I haven't investigated in detail, but at first 
> blush, it appears Pan keeps the group data separate by server, but has a 
> common actual message cache, which appears to be stored based on MsgID, which 
> remains the same between servers.  Thus, it's likely that while a message 
> read on one server won't show as read on another, because that's tracked 
> separately, if you go to retrieve it on the second b4 deleting it off the 
> first (IOW, while the physical message is still in the cache), it shouldn't 
> have to d/l it again, and should immediately see it is there, already.
> 
> At a minimum, due to the unified cache organized by MsgID, it should be far 
> easier to get it working that way eventually.  However, this thought is 
> fairly new to me, and I haven't had a real chance to explore how far it 
> works, by loading the same group on two different servers, so it's all 
> supposition at this point.

Yes, the unique Message-ID is the one thing we've got in our favor --
it will be the key in any lookup table we use for cross-server support.
However, cross-server support hinges upon doing index numbers right.

News servers optimize article lookups in a group by having an index
number for each article in the group.  User-Agents like Pan ask for
articles by index rather than by message-id.  (Though the NNTP spec says
you can request articles by message-id, many servers don't honor the
request or do so with varying degrees of success.)  For bonus fun,
a crossposted article has a different index for each server+group pair.

So to identify an article with enough detail to search across separate
servers, and to ensure that a single delete/save/read propagates the
state across servers & groups, the Message-ID needs to map to tuples
of [server,group,index].  Happily we can get this information by parsing
the Xref headers fetched from each server.
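
A minimal sketch of that mapping, in Python purely for illustration.  It
assumes the usual Xref layout of a server name followed by group:index
pairs; the names parse_xref, record_xref and locations_by_msgid are
hypothetical and not part of Pan.  It also assumes the server name in the
header is good enough as a server key, which may not hold if it differs
from the name the user configured.

```python
from collections import defaultdict


def parse_xref(header_value):
    """Split an Xref header value into (server, [(group, index), ...]).

    header_value is the text after "Xref:", for example
    "news.example.com alt.test:8011 misc.test:42".
    """
    parts = header_value.split()
    if not parts:
        raise ValueError("empty Xref header")
    server, locations = parts[0], []
    for token in parts[1:]:
        # Split on the last colon so the group name stays intact.
        group, _, number = token.rpartition(":")
        if group and number.isdigit():
            locations.append((group, int(number)))
    return server, locations


# Message-ID -> set of (server, group, index) tuples.
locations_by_msgid = defaultdict(set)


def record_xref(message_id, header_value):
    """Remember every server+group+index under which an article is known."""
    server, locations = parse_xref(header_value)
    for group, index in locations:
        locations_by_msgid[message_id].add((server, group, index))
```

Calling record_xref once per server for the same Message-ID accumulates
all the tuples under a single key, which is what a cross-server lookup or
a propagated delete/save/read would need to walk.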

(I wrote tasks.dtd with this in mind -- see task.xml's message identifiers ;)

Chris and I have talked about replacing Pan's current data file format with
SQLite <http://www.hwaci.com/sw/sqlite/>, which is small, fast, and portable
enough not to scuttle the Windows port.  Letting a database map the
message-id to [server,group,index] tuples would be much better than munging
the current data files to do this,  since currently each server+group pair
has its own file, and read article indices are stored in a per-server file.
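
As a rough illustration of what such a mapping could look like, here is a
sketch using Python's standard sqlite3 module.  The table and column names
(article, location, grp, idx) and the helper functions are assumptions
made up for this example, not a proposed schema for Pan.

```python
import sqlite3

# Hypothetical schema: one row per article, one row per place it lives.
SCHEMA = """
CREATE TABLE article (
    msgid TEXT PRIMARY KEY
);
CREATE TABLE location (
    msgid  TEXT    NOT NULL REFERENCES article(msgid),
    server TEXT    NOT NULL,
    grp    TEXT    NOT NULL,  -- "group" and "index" are SQL keywords
    idx    INTEGER NOT NULL,
    PRIMARY KEY (server, grp, idx)
);
CREATE INDEX location_by_msgid ON location(msgid);
"""


def open_db(path=":memory:"):
    """Open a database and create the tables (fresh databases only)."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_location(db, msgid, server, group, index):
    """Record that msgid is article number `index` in `group` on `server`."""
    db.execute("INSERT OR IGNORE INTO article(msgid) VALUES (?)", (msgid,))
    db.execute(
        "INSERT OR REPLACE INTO location(msgid, server, grp, idx) "
        "VALUES (?, ?, ?, ?)",
        (msgid, server, group, index),
    )


def locations(db, msgid):
    """Return every (server, group, index) tuple known for msgid."""
    rows = db.execute(
        "SELECT server, grp, idx FROM location WHERE msgid = ? "
        "ORDER BY server, grp, idx",
        (msgid,),
    )
    return rows.fetchall()
```

Keying location on (server, grp, idx) reflects that an index is only
meaningful within one server+group pair, while the secondary index on
msgid keeps the cross-server lookup cheap.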

An issue tied to managing these msgid->(server,group,index)+ relations
is how to track read articles.  Mapping msgid to a "read" flag is easy
to write, but it's insufficient for importing/exporting newsrc files:
if we key off the msgid internally, any article in the newsrc string
that's been deleted in Pan will show up as unread in Pan's exported newsrc
file:

   (1) Pan user imports a .newsrc from her other newsreader.  It includes:
       "alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010-8014, 8020"
   (2) User in Pan deletes some articles which had the indices 8011 and 8013
   (3) User exits Pan, which writes the following line to .newsrc:
       "alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010,8012,8014, 8020"
   (4) Back in the other newsreader, articles 8011 and 8013 are now unread.

To have isomorphic .newsrc import/exports, it would be better to keep
read/unread flags in a [server,group,newsrc] tuple where the newsrc
is some representation of a single newsrc line (in the db, a newsrc string;
in Pan, a pan/base/Newsrc object).  This needs to be taken into account to get
cross-server articles right.
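
The difference between the two approaches can be sketched with the
jackhammers example from steps (1)-(4).  This is illustrative Python with
hypothetical helper names; it is not pan/base/Newsrc, and it writes ranges
without spaces after the commas.

```python
def parse_ranges(text):
    """Parse '1-8000, 8010-8014, 8020' into a sorted list of (low, high)."""
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        ranges.append((int(low), int(high or low)))
    return sorted(ranges)


def format_ranges(ranges):
    """Turn [(low, high), ...] back into a newsrc-style range string."""
    return ",".join(str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges)


def ranges_from_indices(indices):
    """Collapse individual article numbers into contiguous ranges."""
    ranges = []
    for n in sorted(set(indices)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)
        else:
            ranges.append((n, n))
    return ranges


imported = parse_ranges("1-8000, 8010-8014, 8020")
deleted = {8011, 8013}

# Read flag keyed per article: deleted articles take their flag with them,
# so the export can only be rebuilt from the articles that survive.
surviving = [n for lo, hi in imported
             for n in range(lo, hi + 1) if n not in deleted]
lossy = format_ranges(ranges_from_indices(surviving))

# Read state kept as ranges per (server, group): deleting an article does
# not touch the ranges, so the export matches what was imported.
stable = format_ranges(imported)
```

Here lossy has holes at 8011 and 8013, while stable still covers
8010-8014 as the imported line did.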

The next step to getting cross-server harvesting right is, IMO, to get
the tables defined right and to move over to SQLite.  I'd be interested
in any feedback/discussion/action on this on pan-devel.

cheers,
Charles



