pan-users

Re: [Pan-users] Policy discussion: GNKSA


From: Duncan
Subject: Re: [Pan-users] Policy discussion: GNKSA
Date: Mon, 4 Jul 2011 20:03:54 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT 9996aa7 branch-master)

Rob posted on Mon, 04 Jul 2011 14:36:23 -0400 as excerpted:

> On Monday 04 July 2011 11:42, Alan Meyer wrote:
>> Of course in theory, theory and practice are the same.  But in practice
>> ... maybe I'm totally screwed up here.
> 
> Practically speaking, I hit binsearch.info and found an Ubuntu ISO to
> test with, currently the first result here:

Thanks for putting the practice to the theory for us.  That's the great 
thing about such discussions on lists/newsgroups.  If there's a 
reasonably sized group involved in the discussion, there's generally at 
least one person who can explain the theory, and one that's willing to 
test to see if reality and theory match. Alan and I seem to be the 
former; you're the first of the latter so far, so... thanks!  =:^)

Unsplit the url:

> http://binsearch.info/?q=ubuntu-10.04.1-desktop-i386.iso&max=100&adv_age=600&server=

> I was going to post the result of [10/4/1] connections, but [meatspace]
> and the single-connection one is still going.  Each time, I created a
> new directory, put the nzb file into it, purged Pan's article cache,
> adjusted the max connections to the appropriate number and watched
> jnettop to verify it was working after issuing the following command
> line:
> 
> time pan --no-gui --nzb ubuntu\ iso.nzb -o `pwd`
> 
> I did 10 connections first, so that if there were any server-side
> caching it would be disadvantaged. [Here's 10/4, respectively.]
> 
> real  14m1.951s user  1m46.775s sys   0m31.602s
> 
> real  35m31.818s user 4m37.997s sys   1m11.544s
> 
> [60% savings] 10 connections vs. 4.
> [T]he single-connection one is still going [after 25 min.]  
> [Note that] the NZB created by the above query creates 2 copies of
> the ISO, [thus the 15 minutes for a 700 MB iso.]
> 
> This was using Giganews, a service which advertises 20 simultaneous
> connections for its lowest-end subscription level.  For my subscription
> level, it advertises 50 simultaneous connections.
> 
> Whether it's my ISP, Giganews, my router or something else throttling
> things on a per-connection basis, the only conclusion I can come up with
> is that Pan's attempts at GNKSA compliance are hamstringing its binary
> performance relative to more modern newsreaders, except for those of us
> who are comfortable tweaking XML files by hand.

That's the bit that /does/ bother me.

There are a few questions remaining with the above tests, too.

1) What's your max allowed pipe speed?  Current typical mid-range (US) 
DSL/Cable download speeds run probably 10-20 Mbps, down.  If you're 
connected via the 100 Mbps + available in some places, tho, that's a bit 
different, as would be the 3 Mbps or less connections typical a few years 
ago, and still the only reasonable cost alternative in many locations.
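
For a rough sense of where the reported runs sit relative to those pipe 
speeds, here is a minimal Python sketch.  It assumes a ~1.4 gig decoded 
payload (two copies of a 700 MB ISO) in decimal units, and it ignores 
encoding and protocol overhead, so the figures are a floor on the actual 
line rate.  The function names are purely illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert the 'real' figure from time(1), e.g. 14m1.951s, to seconds."""
    return minutes * 60 + seconds


def payload_mbps(payload_bytes, elapsed_s):
    """Average decoded-payload rate in megabits per second."""
    return payload_bytes * 8 / elapsed_s / 1e6


# Assumption: ~1.4 gig of decoded data (two copies of a 700 MB ISO).
PAYLOAD_BYTES = 1.4e9

t10 = to_seconds(14, 1.951)   # 10 connections, as reported
t4 = to_seconds(35, 31.818)   # 4 connections, as reported

print(f"10 connections: {payload_mbps(PAYLOAD_BYTES, t10):.1f} Mbps")
print(f" 4 connections: {payload_mbps(PAYLOAD_BYTES, t4):.1f} Mbps")
print(f"time saved, 10 vs. 4: {1 - t10 / t4:.1%}")
```

Under those assumptions the 10-connection run averages a bit over 13 Mbps 
of payload and the 4-connection run a bit over 5 Mbps, which is why 
knowing the pipe max matters: the faster run may already be within reach 
of a typical mid-range connection's limit.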

2) What's the encoding?  If it's traditional MIME/Base64 or UUE, there's 
an encoding overhead of 33-40%, making that ~1.4 gig closer to 1.9 gig of 
actual downloaded encoded data.  If it's yEnc, the overhead is a much 
more manageable ~5%, ~1.5 gig encoded.  That doesn't mess up strictly 
relative comparisons like the above, but as the highest speed approaches 
your pipe max, it makes a big difference.

(Actually, after looking at the search, assuming you're talking about the 
Peter Hannay posting to a.b.test, it says yEnc in the title, thus 
answering /that/ question. =:^)
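
The overhead arithmetic above can be checked with a few lines of Python.  
This is only a sketch using the rough percentages quoted in this message 
(they are estimates, not measurements), and the names are illustrative.

```python
# Assumption: ~1.4 gig of decoded data, as in the test above.
DECODED_GB = 1.4

# Rough overhead fractions as quoted in the text, not measured values.
OVERHEAD = {
    "MIME/Base64 or UUE (low end)": 0.33,
    "MIME/Base64 or UUE (high end)": 0.40,
    "yEnc": 0.05,
}


def encoded_size_gb(decoded_gb, overhead):
    """Size on the wire given a fractional encoding overhead."""
    return decoded_gb * (1 + overhead)


for name, overhead in OVERHEAD.items():
    size = encoded_size_gb(DECODED_GB, overhead)
    print(f"{name}: ~{size:.2f} gig encoded")
```

That gives roughly 1.86 to 1.96 gig for the older encodings and about 
1.47 gig for yEnc.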

3) This bit of course requires XML file editing as well (cache size would 
need to be >= 2 gig, up from the 10 MB default that pretty much forces 
download-and-save at once), but it'd be interesting to see what the 
download-to-cache time/speed was.  If there's a big difference tilted in 
favor of download-to-cache as I suspect there likely is, one way around 
the problem while staying within the current GNKSA 4-connection limit 
would be to up pan's default cache size and make download-to-cache the 
encouraged and default behavior.

Another way around it, which would help for some, would be to move the 
decoding/saving to separate threads, so the download threads could 
immediately go back to downloading.  However, results there would vary 
depending on local computer resources: core count and speed (single-core 
vs. dual vs. quad vs. 6 vs. 8/12/whatever) and local storage speed 
(high-speed RAID-0 or RAID-10, or a high-speed SSD, in preference to a 
single spinning disk, and for the latter, a current-generation drive is 
likely to be somewhat better than one a few years old).  Threading the 
decode separately would only help if you're not CPU-bound or 
storage-bandwidth-bound already.  Obviously, that's not the limit in your 
case (tho for the 10-connection run it /might/ be), but I doubt you're on 
a single-core <2 GHz with a single spinning drive from the era when 
<100 gig drives were still big, either.
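
To make the separate-threads idea concrete, here is a minimal 
producer/consumer sketch in Python.  It is not how Pan is implemented; 
fetch_article() and decode_and_save() are hypothetical stand-ins for the 
network fetch and the decode-plus-write step, and the thread counts are 
arbitrary.  It only shows the shape: downloaders hand raw data to a queue 
and go straight back to fetching, while decoder threads drain the queue.

```python
import queue
import threading


def fetch_article(article_id):
    """Hypothetical stand-in for a network fetch; returns fake encoded bytes."""
    return f"encoded-{article_id}".encode()


def decode_and_save(raw, saved, lock):
    """Hypothetical stand-in for decoding an article and writing it to disk."""
    decoded = raw.decode().replace("encoded-", "decoded-")
    with lock:
        saved.append(decoded)


def run_pipeline(article_ids, n_downloaders=4, n_decoders=2):
    todo = queue.Queue()      # article ids waiting to be fetched
    fetched = queue.Queue()   # raw articles waiting to be decoded
    saved, lock = [], threading.Lock()

    for article_id in article_ids:
        todo.put(article_id)

    def downloader():
        while True:
            try:
                article_id = todo.get_nowait()
            except queue.Empty:
                return
            # Hand off the raw data and go straight back to downloading.
            fetched.put(fetch_article(article_id))

    def decoder():
        while True:
            raw = fetched.get()
            if raw is None:   # sentinel: no more work
                return
            decode_and_save(raw, saved, lock)

    downloaders = [threading.Thread(target=downloader) for _ in range(n_downloaders)]
    decoders = [threading.Thread(target=decoder) for _ in range(n_decoders)]
    for thread in downloaders + decoders:
        thread.start()
    for thread in downloaders:
        thread.join()
    for _ in decoders:        # one sentinel per decoder thread
        fetched.put(None)
    for thread in decoders:
        thread.join()
    return sorted(saved)


print(run_pipeline(range(5)))
```

As noted above, this arrangement only pays off when the decode/save side 
is not itself the bottleneck.  In CPython specifically, CPU-bound decoding 
in threads would also contend for the interpreter lock, so treat this as 
an illustration of the structure rather than a performance recipe.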

4) Talking about local computer resources, I suppose at least dual-core 
is a reasonable assumption these days, but if you're getting that 
dramatic a difference on < quad-core and a single mechanical drive, it's 
WAY more interesting than if you're getting it on a 6-core+ high-speed SSD 
or RAID-0/RAID-10 setup.

5) Perhaps I'm old fashioned but I tend to still prefer wired local 
connections, for many reasons including local connection reliability, 
speed and test duplicability.  If you're on wireless, all sorts of weird 
stuff could interfere with an individual download, making individual 
results count rather less.

6) Lastly, to answer the "where's the knee" question, results from 6 and 
8 connections would be useful, and possibly from say 13, 16, 20, 30 and 50 
(since you have it...) if the results keep getting better as you go up.  
Obviously that increases the work you'd have to do, tho.  And I don't 
know what your Giganews per-month limit is.  If it's not unlimited, that 
number of tests could use up a rather larger chunk of it than you wish 
to dedicate to simple testing.
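
One way to look for the knee once more data points exist is to compare 
the speedup between adjacent connection counts with the increase in 
connections.  The Python sketch below is only an illustration: the 
function name is made up, and the only timings in it are the two runs 
already reported (4 and 10 connections), so it cannot locate a knee by 
itself.

```python
def marginal_speedups(timings):
    """timings maps connection count -> elapsed seconds.

    Returns (from_conns, to_conns, speedup, connection_ratio) for each
    adjacent pair.  A speedup well below the connection ratio suggests the
    knee lies somewhere in that interval.
    """
    points = sorted(timings.items())
    rows = []
    for (c0, t0), (c1, t1) in zip(points, points[1:]):
        rows.append((c0, c1, t0 / t1, c1 / c0))
    return rows


# Only the two runs reported so far; more counts would be added here.
timings = {
    4: 35 * 60 + 31.818,
    10: 14 * 60 + 1.951,
}

for c0, c1, speedup, ratio in marginal_speedups(timings):
    print(f"{c0} -> {c1} connections: {speedup:.2f}x faster "
          f"for {ratio:.2f}x the connections")
```

For the two existing runs the speedup (about 2.5x) roughly matches the 
increase in connections (2.5x), so on this data alone there is no sign of 
a knee between 4 and 10.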

#3 is the most interesting to me, but also requires the most work from 
you, especially if combined with #6.  Most of the others are simply 
reporting data you probably already have at hand.

But the results are certainly interesting (and potentially real-world 
useful for binary downloaders outside this debate as well) even as they 
stand now.  Thanks VERY much!

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



