[Pan-devel] Re: [Pan-users] Making Multiple Servers Happen In Your Lifet
From: Charles Kerr
Subject: [Pan-devel] Re: [Pan-users] Making Multiple Servers Happen In Your Lifetime
Date: Mon, 7 Oct 2002 11:46:26 -0700
User-agent: Mutt/1.3.20i
[Setting followups to pan-devel]
On Mon, Oct 07, 2002 at 03:06:23AM -0700, Duncan wrote:
> As for what it can do now... On the face of it, the servers are separate.
> However, I was thinking about that in regards to its cache handling the
> other day and wondering... I haven't investigated in detail, but at first
> blush, it appears Pan keeps the group data separate by server, but has a
> common actual message cache, which appears to be stored based on MsgID, which
> remains the same between servers. Thus, it's likely that while a message
> read on one server won't show as read on another, because that's tracked
> separately, if you go to retrieve it on the second b4 deleting it off the
> first (IOW, while the physical message is still in the cache), it shouldn't
> have to d/l it again, and should immediately see it is there, already.
>
> At a minimum, due to the unified cache organized by MsgID, it should be far
> easier to get it working that way eventually. However, this thought is
> fairly new to me, and I haven't had a real chance to explore how far it
> works, by loading the same group on two different servers, so it's all
> supposition at this point.
Yes, the unique Message-ID is the one thing we've got in our favor --
it will be the key in any lookup table we use for cross-server support.
However, cross-server support hinges upon doing index numbers right.
News servers optimize article lookups in a group by having an index
number for each article in the group. User-Agents like Pan ask for
articles by index rather than by message-id. (Though the NNTP spec says
you can request articles by message-id, many servers don't honor the
request or do so with varying degrees of success.) For bonus fun,
a crossposted article has a different index for each server+group pair.
So to identify an article with enough detail to search across separate
servers, and to ensure that a single delete/save/read propagates the
state across servers & groups, the Message-ID needs to map to tuples
of [server,group,index]. Happily we can get this information by parsing
the Xref headers fetched from each server.
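
To make the mapping concrete, here is a rough Python sketch (not Pan code) of
turning Xref values into Message-ID -> (server, group, index) tuples. The
names parse_xref, record_article and locations_by_msgid are made up for
illustration. It assumes the usual Xref layout of a server name followed by
group:number pairs, and that the header value is passed in without the
"Xref:" field name. Note that the name in an Xref header is whatever the
server calls itself, which may not match the hostname the user configured.

```python
from collections import defaultdict


def parse_xref(header_value):
    """Split an Xref header value into (server, [(group, index), ...])."""
    parts = header_value.split()
    if not parts:
        raise ValueError("empty Xref header")
    server, locations = parts[0], []
    for token in parts[1:]:
        # Split on the last colon so the group name stays intact.
        group, _, number = token.rpartition(":")
        if group and number.isdigit():
            locations.append((group, int(number)))
    return server, locations


# Hypothetical in-memory table: Message-ID -> set of (server, group, index).
locations_by_msgid = defaultdict(set)


def record_article(message_id, xref_value):
    """Remember every server+group location named in one Xref header."""
    server, locations = parse_xref(xref_value)
    for group, index in locations:
        locations_by_msgid[message_id].add((server, group, index))
```

Feeding in the Xref for the same Message-ID from two servers leaves one
entry holding every place the article lives, which is what a single
delete/save/read would need to walk.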
(I wrote tasks.dtd with this in mind -- see task.xml's message identifiers ;)
Chris and I have talked about replacing Pan's current data file format with
SQLite <http://www.hwaci.com/sw/sqlite/>, which is small, fast, and portable
enough not to scuttle the Windows port. Letting a database map the
message-id to [server,group,index] tuples would be much better than munging
the current data files to do this, since currently each server+group pair
has its own file, and read article indices are stored in a per-server file.
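
As a starting point for discussion, one possible table layout is sketched
below using Python's sqlite3 module against an in-memory database. The
table and column names (article_location, newsgroup, article_index) and the
helper functions are assumptions for illustration only, not an agreed
schema; "group" and "index" are avoided as column names because they are
SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article_location (
    message_id    TEXT    NOT NULL,
    server        TEXT    NOT NULL,
    newsgroup     TEXT    NOT NULL,
    article_index INTEGER NOT NULL,
    -- An index number is only unique within one server+group pair.
    PRIMARY KEY (server, newsgroup, article_index)
);
-- Cross-server lookups go by Message-ID, so index that column too.
CREATE INDEX article_location_msgid ON article_location (message_id);
""")


def add_location(conn, message_id, server, group, index):
    """Record that message_id is article `index` of `group` on `server`."""
    conn.execute(
        "INSERT OR REPLACE INTO article_location VALUES (?, ?, ?, ?)",
        (message_id, server, group, index),
    )


def find_locations(conn, message_id):
    """Return every (server, group, index) tuple known for a Message-ID."""
    rows = conn.execute(
        "SELECT server, newsgroup, article_index FROM article_location"
        " WHERE message_id = ? ORDER BY server, newsgroup",
        (message_id,),
    )
    return rows.fetchall()
```

With something like this, one query on message_id replaces a walk over
every per-server+group file.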
An issue tied to managing these msgid->(server,group,index)+ relations
is how to track read articles. Mapping msgid to a "read" flag is easy
to write, but it's insufficient for importing/exporting newsrc files:
if we key off the msgid internally, any article in the newsrc string
that's been deleted in Pan will show up as unread in Pan's exported newsrc
file:
(1) Pan user imports a .newsrc from her other newsreader. It includes:
"alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010-8014, 8020"
(2) User in Pan deletes some articles which had the indices 8011 and 8013
(3) User exits Pan, which writes the following line to .newsrc:
"alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010,8012,8014, 8020"
(4) Back in the other newsreader, articles 8011 and 8013 are now unread.
To have isomorphic .newsrc import/exports, it would be better to keep
read/unread markings in a [server,group,newsrc] tuple where the newsrc
is some representation of a single newsrc line (in the db, a newsrc string;
in Pan, a pan/base/Newsrc object). This needs to be taken into account to get
cross-server articles right.
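
The round trip described above can be sketched in Python as follows. This
is an illustration only, not Pan's pan/base/Newsrc implementation; the
function names are hypothetical, and it assumes the common newsrc line shape
of a group name, a ":" (subscribed) or "!" (unsubscribed) marker, then
comma-separated numbers and low-high ranges.

```python
import re


def parse_newsrc_line(line):
    """Return (group, subscribed, [(low, high), ...]) for one newsrc line."""
    match = re.match(r"^([^:!\s]+)([:!])\s*(.*)$", line.strip())
    if match is None:
        raise ValueError("not a newsrc line: %r" % line)
    group, marker, body = match.groups()
    ranges = []
    for token in body.replace(" ", "").split(","):
        if not token:
            continue
        low, _, high = token.partition("-")
        # A bare number such as "8020" is a range of one article.
        ranges.append((int(low), int(high or low)))
    return group, marker == ":", ranges


def format_newsrc_line(group, subscribed, ranges):
    """Write the ranges back out in newsrc form."""
    body = ",".join(
        str(low) if low == high else "%d-%d" % (low, high)
        for low, high in ranges
    )
    return "%s%s %s" % (group, ":" if subscribed else "!", body)


def is_read(ranges, index):
    """True if the article index falls inside any read range."""
    return any(low <= index <= high for low, high in ranges)
```

Because the ranges are stored per server+group and never consult which
articles Pan still holds, deleting 8011 and 8013 locally leaves the
8010-8014 range intact when the line is exported.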
The next step to getting cross-server harvesting right is, IMO, to get
the tables defined right and to move over to SQLite. I'd be interested
in any feedback/discussion/action on this on pan-devel.
cheers,
Charles