Re: [Pan-users] Making Multiple Servers Happen In Your Lifetime


From: Frank Bruno
Subject: Re: [Pan-users] Making Multiple Servers Happen In Your Lifetime
Date: Mon, 07 Oct 2002 17:46:04 -0400

BNR has this kind of feature, although it is not perfect. It lets you set
priority levels on servers and on articles. The article priority sorts the
articles, and the server priority (essentially) sorts the server list. The one
feature they missed, which I would recommend adding to the list, is the ability
to mark a server so that it exclusively fetches every article it can from the
tagged list. This would be useful for an ISP news server that is particularly
fast but missing lots of stuff. The premium servers would then grab only the
content that the ISP server didn't have.
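
A rough sketch of that selection rule, in Python for brevity. This is not how
BNR or Pan does it; the Server class, the "lower number is tried first"
priority convention, and the assign_articles() helper are all made up for
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    priority: int  # assumed convention: lower number is tried first
    available: set = field(default_factory=set)  # Message-IDs the server carries


def assign_articles(tagged, servers):
    """Map each tagged Message-ID to the highest-priority server that has it."""
    ordered = sorted(servers, key=lambda s: s.priority)
    plan = {}
    for msgid in tagged:
        for server in ordered:
            if msgid in server.available:
                plan[msgid] = server.name
                break
        else:
            plan[msgid] = None  # no configured server carries this article
    return plan


# Fast ISP server with gaps, plus a premium server used only as a fallback.
isp = Server("isp", 0, {"<a@x>", "<b@x>"})
premium = Server("premium", 1, {"<a@x>", "<b@x>", "<c@x>"})
plan = assign_articles(["<a@x>", "<b@x>", "<c@x>", "<d@x>"], [premium, isp])
```

With this rule the premium server is only asked for <c@x>, which keeps its
download quota for articles the ISP server is missing.
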
-Frank

"Mark H. Kraml" wrote:

> If we keep talking about this feature, we may get some good ideas down
> and actual traction on the feature may follow.
>
> I just signed up for a premium news account, clearly this feature would
> be of great benefit. After giving it some thought it is, as stated
> earlier, not so simple to do. At least not without a good hit on
> performance. While my regular ISP has a limited amount of storage for
> news (about 1-2 days), the premium ISP has over 20 days of storage. This
> makes some newsgroups that have over 1 million headers.
>
> It takes a long time to download "ALL" headers from there. They also give
> me a 6GB limit for 30 days, this means I really only want to get from
> there if I really have to.
>
> On the topic of performance, I have a 1GHz P3 with 512MB RAM, not the
> fastest around, but not out of the norm of machines we would want to
> support. Even today, with 1 million headers, the performance for loading
> and managing groups is quite, well, unimpressive.
>
> On Mon, 2002-10-07 at 14:46, Charles Kerr wrote:
> > [Setting followups to pan-devel]
> >
> > On Mon, Oct 07, 2002 at 03:06:23AM -0700, Duncan wrote:
> > > As for what it can do now... On the face of it, the servers are separate.
> > > However, I was thinking about that in regards to its cache handling the
> > > other day and wondering...  I haven't investigated in detail, but at first
> > > blush, it appears Pan keeps the group data separate by server, but has a
> > > common actual message cache, which appears to be stored based on MsgID,
> > > which remains the same between servers.  Thus, it's likely that while a
> > > message read on one server won't show as read on another, because that's
> > > tracked separately, if you go to retrieve it on the second b4 deleting it
> > > off the first (IOW, while the physical message is still in the cache), it
> > > shouldn't have to d/l it again, and should immediately see it is there,
> > > already.
> > >
> > > At a minimum, due to the unified cache organized by MsgID, it should be
> > > far easier to get it working that way eventually.  However, this thought is
> > > fairly new to me, and I haven't had a real chance to explore how far it
> > > works, by loading the same group on two different servers, so it's all
> > > supposition at this point.
> >
> > Yes, the unique Message-ID is the one thing we've got in our favor --
> > it will be the key in any lookup table we use for cross-server support.
> > However, cross-server support hinges upon doing index numbers right.
> >
> > News servers optimize article lookups in a group by having an index
> > number for each article in the group.  User-Agents like Pan ask for
> > articles by index rather than by message-id.  (Though the NNTP spec says
> > you can request articles by message-id, many servers don't honor the
> > request or do so with varying degrees of success.)  For bonus fun,
> > a crossposted article has a different index for each server+group pair.
> >
> > So to identify an article with enough detail to search across separate
> > servers, and to ensure that a single delete/save/read propagates the
> > state across servers & groups, the Message-ID needs to map to tuples
> > of [server,group,index].  Happily we can get this information by parsing
> > the Xref headers fetched from each server.
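
The mapping described above can be sketched roughly as follows (Python, for
brevity). It assumes the usual Xref layout of a host name followed by
"group:number" entries; parse_xref() and ArticleIndex are hypothetical names,
not anything in Pan. The server label is passed in by the caller because the
host name inside Xref is whatever the server calls itself, which may not match
the name it is configured under.

```python
from collections import defaultdict


def parse_xref(value):
    """Split an Xref header value into (xref_host, [(group, index), ...])."""
    parts = value.split()
    if not parts:
        return None, []
    host, locations = parts[0], []
    for token in parts[1:]:
        group, sep, number = token.rpartition(":")
        if sep and number.isdigit():
            locations.append((group, int(number)))
    return host, locations


class ArticleIndex:
    """Message-ID -> set of (server, group, index) tuples."""

    def __init__(self):
        self._locations = defaultdict(set)

    def add(self, server, message_id, xref_value):
        # 'server' is the caller's own label for the server it fetched from.
        _, locations = parse_xref(xref_value)
        for group, number in locations:
            self._locations[message_id].add((server, group, number))

    def lookup(self, message_id):
        return sorted(self._locations.get(message_id, ()))


index = ArticleIndex()
index.add("isp", "<m1@example>", "news.isp.example alt.test:8011 misc.test:42")
index.add("premium", "<m1@example>", "news.premium.example alt.test:991203")
found = index.lookup("<m1@example>")
```

Here found holds three tuples for the one Message-ID: two on the ISP server
(one per crossposted group) and one on the premium server.
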
> >
> > (I wrote tasks.dtd with this in mind -- see task.xml's message identifiers ;)
> >
> > Chris and I have talked about replacing Pan's current data file format with
> > SQLite <http://www.hwaci.com/sw/sqlite/>, which is small, fast, and portable
> > enough to not scuttle the Windows port.   Letting a database map the
> > message-id to [server,group,index] tuples would be much better than munging
> > the current data files to do this,  since currently each server+group pair
> > has its own file, and read article indices are stored in a per-server file.
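
One possible shape for such a table, sketched with Python's sqlite3 module.
The table name, column names, and helper functions are assumptions made for
illustration only; no schema has actually been decided in this thread.

```python
import sqlite3

# Hypothetical schema: one row per place an article lives.
SCHEMA = """
CREATE TABLE article_location (
    message_id  TEXT    NOT NULL,
    server      TEXT    NOT NULL,
    newsgroup   TEXT    NOT NULL,
    article_no  INTEGER NOT NULL,
    PRIMARY KEY (server, newsgroup, article_no)
);
CREATE INDEX article_location_by_msgid ON article_location (message_id);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_location(db, message_id, server, newsgroup, article_no):
    db.execute(
        "INSERT OR REPLACE INTO article_location VALUES (?, ?, ?, ?)",
        (message_id, server, newsgroup, article_no),
    )


def locations_for(db, message_id):
    """Return every (server, newsgroup, article_no) known for a Message-ID."""
    rows = db.execute(
        "SELECT server, newsgroup, article_no FROM article_location "
        "WHERE message_id = ? ORDER BY server, newsgroup, article_no",
        (message_id,),
    )
    return rows.fetchall()


db = open_db()
add_location(db, "<m1@example>", "isp", "alt.test", 8011)
add_location(db, "<m1@example>", "premium", "alt.test", 991203)
rows = locations_for(db, "<m1@example>")
```

The primary key reflects that an index number is only unique within one
server+group pair, while the secondary index makes the Message-ID lookup cheap.
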
> >
> > An issue tied to managing these msgid->(server,group,index)+ relations
> > is how to track read articles.  Mapping msgid to a "read" flag is easy
> > to write, but it's insufficient for importing/exporting newsrc files:
> > if we key off the msgid internally, any article in the newsrc string
> > that's been deleted in Pan will show up as unread in Pan's exported newsrc
> > file:
> >
> >    (1) Pan user imports a .newsrc from her other newsreader.  It includes:
> >        "alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010-8014, 8020"
> >    (2) User in Pan deletes some articles which had the indices 8011 and 8013
> >    (3) User exits Pan, which writes the following line to .newsrc:
> >        "alt.binaries.sounds.mp3.jackhammers: 1-8000, 8010,8012,8014, 8020"
> >    (4) Back in the other newsreader, articles 8011 and 8013 are now unread.
> >
> > To have isomorphic .newsrc import/exports, it would be better to keep
> > read/unread markings in a [server,group,newsrc] tuple where the newsrc
> > is some representation of a single newsrc line (in the db, a newsrc string;
> > in Pan, a pan/base/Newsrc object).  This needs to be taken into account to
> > get cross-server articles right.
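
The round-trip problem from steps (1) to (4) can be reproduced with a small
sketch (Python, for brevity). The helper names are made up and this is not
Pan's pan/base/Newsrc code; it only contrasts rebuilding the line from the
surviving articles with keeping the read ranges as their own record. Output
uses commas without spaces.

```python
def parse_ranges(text):
    """Parse '1-8000, 8010-8014, 8020' into a list of (low, high) pairs."""
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def expand(ranges):
    """Turn (low, high) pairs into the set of article numbers they cover."""
    return {n for lo, hi in ranges for n in range(lo, hi + 1)}


def format_ranges(numbers):
    """Collapse article numbers back into newsrc range notation."""
    runs = []
    run_start = prev = None
    for n in sorted(set(numbers)):
        if prev is not None and n == prev + 1:
            prev = n
            continue
        if run_start is not None:
            runs.append((run_start, prev))
        run_start = prev = n
    if run_start is not None:
        runs.append((run_start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


imported = "1-8000, 8010-8014, 8020"
read = expand(parse_ranges(imported))
deleted = {8011, 8013}

# Keyed off surviving articles only: the deleted ones fall out of the line.
lossy = format_ranges(read - deleted)
# Read ranges kept per (server, group), independent of deletion.
faithful = format_ranges(read)
```

The lossy string splits 8010-8014 into 8010,8012,8014, which is what makes
8011 and 8013 reappear as unread elsewhere; the faithful string keeps the
original ranges.
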
> >
> > The next step to getting cross-server harvesting right is, IMO, to get
> > the tables defined right and to move over to SQLite.  I'd be interested
> > in any feedback/discussion/action on this on pan-devel.
> >
> > cheers,
> > Charles
> >
> >
> > _______________________________________________
> > Pan-users mailing list
> > address@hidden
> > http://mail.freesoftware.fsf.org/mailman/listinfo/pan-users
> >