[Pan-users] Re: performance ...
From: Duncan
Subject: [Pan-users] Re: performance ...
Date: Tue, 9 Dec 2008 15:17:51 +0000 (UTC)
User-agent: Pan/0.133 (House of Butterflies)
address@hidden posted
address@hidden, excerpted below, on
Tue, 09 Dec 2008 13:41:28 +0100:
>> But what you're seeing is normal. Keep in mind that if pan is saying a
>> million articles, that's after combining multiparts. In some groups,
>> that could mean ten or fifty million actual single-part articles.
>
> I was referring to the total article count (not the thread count). My
> largest groups file is 900 MB.
But how are you counting articles? Are you using the pan unread count or
something else? If you're using pan's unread count, it counts all the
parts of a single multipart (aka multi-segment) message as one entry.
Those are parts, not threads. If you look you can see it, 15/15 or
whatever, or sometimes a missing part, 14/15, with the corresponding
broken-puzzle icon instead of the full-puzzle icon.
With old-pan you could separate the parts into the individual pieces,
which then showed up as "threads". With new-pan, it's displayed only
once, tho actual replies are still shown threaded.
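To make the counting difference concrete, here's a minimal sketch of folding single-part articles into multipart entries. It is NOT pan's actual code. The "(part/total)" subject suffix and the combine_multiparts name are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Assumed subject convention: a trailing "(3/15)" marks part 3 of 15.
PART_RE = re.compile(r"^(?P<base>.*)\((?P<part>\d+)/(?P<total>\d+)\)\s*$")

def combine_multiparts(subjects):
    """Fold part subjects into entries: base subject -> (parts seen, total)."""
    seen = defaultdict(set)
    totals = {}
    singles = []
    for subject in subjects:
        match = PART_RE.match(subject)
        if match is None:
            singles.append(subject)  # ordinary single-part article
            continue
        base = match.group("base").strip()
        seen[base].add(int(match.group("part")))
        totals[base] = int(match.group("total"))
    entries = {base: (len(parts), totals[base]) for base, parts in seen.items()}
    return entries, singles

# Fifteen-part post with part 7 missing, plus one plain text reply.
subjects = [f"big.iso ({i}/15)" for i in range(1, 16) if i != 7]
subjects.append("Re: performance ...")
entries, singles = combine_multiparts(subjects)
print(len(subjects), "raw articles ->", len(entries) + len(singles), "entries")
print(entries)
```

Here 15 raw articles collapse to two displayed entries, with the incomplete one reported as 14 of 15, the broken-puzzle case.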
> Perhaps using memory maps might speed things up? Also the data seems to
> be written in ASCII format, requiring a rescan/reparse every time.
> Perhaps saving in binary, which allows even more efficient use of memory
> maps, might be useful (as an option only for large groups, perhaps?). It might
> not reduce the size of the file but it will avoid having to convert lots
> of integers (like line numbers, sizes, dates etc). Also it would allow
> to read in blocks without having to process those blocks.
Perhaps as an option. Note that binary is a much more opaque format,
much harder to repair manually if necessary.
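For what the binary-plus-mmap idea might look like, here's a rough sketch. The record layout (article number, line count, byte size, date) and the function names are made up for illustration; pan's real on-disk format is different.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: article number, line count, byte size,
# date as a Unix timestamp. Little-endian, no padding, 24 bytes each.
RECORD = struct.Struct("<QIIq")

def write_records(path, records):
    """Write every record as raw fixed-width binary, no text conversion."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))

def read_record(path, index):
    """Fetch one record by position without parsing the ones before it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)

# Arbitrary sample data, only there to exercise the round trip.
records = [(n, 40 + n, 5000 * n, 1228835871 + n) for n in range(1, 1001)]
path = os.path.join(tempfile.mkdtemp(), "group.bin")
write_records(path, records)
third = read_record(path, 2)
print(third)
```

The fixed record size is what makes the direct seek possible, and it's also what makes the file unreadable in a text editor, which is the repair problem mentioned above.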
> That is true, but when you know you might need to treat 1G of data, you
> start managing the data cleverly. Generally you try to save the work
> that you did for later purposes. E.g. if you have already figured out
> certain things, you store that info so that you don't have to figure it
> out later on.
As I said, pan now does save its work. Old-pan used to re-thread every
time you loaded the group.
>> Meanwhile, how do you monitor CPU usage? Are you monitoring it per
>> core, or overall only? Most of new-pan is single-threaded, because
>> Charles had gone with multi-threaded in old-pan and found the
>> complexity and thread-race bugs just not worth it for the limited
>> increase in performance. Instead, new-pan now hatches threads only in
>> limited performance critical sections (like when starting multiple
>> connections at once, one place I know it's used as I remember Charles
>> fixing a bug I had with it). So pan will likely be using near 100% of
>> a single core, but the others should remain mostly idle, I /think/.
>> (It has been awhile since I did binaries and IDR for sure.)
>
> No, when it is busy doing stuff and blocking other apps from doing
> something, I ran top and it showed pan using about 80% CPU, constantly,
> for a certain time.
Yes, but what are your top settings? Are you showing each individual
core separately or only the combined figure, and are you using Irix or
Solaris mode? I'm asking because, depending on the setting, all four
cores at 100% each could be reported as either 400% or 100%, with 100%
of a single core correspondingly showing as 100% or 25%.
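The arithmetic behind those numbers, as a small sketch. The function names are mine, not anything from top itself. Irix mode reports usage as a percentage of one CPU, Solaris mode divides that by the number of CPUs.

```python
def top_cpu_percent(busy_cores, total_cores, irix_mode=True):
    """What top would show for a process keeping `busy_cores` cores busy."""
    if irix_mode:
        return busy_cores * 100.0            # percent of a single CPU
    return busy_cores * 100.0 / total_cores  # percent of the whole machine

def cores_busy(reading, total_cores, irix_mode=True):
    """Invert a top reading back into 'how many cores is this really?'."""
    if irix_mode:
        return reading / 100.0
    return reading * total_cores / 100.0

for irix in (True, False):
    mode = "Irix" if irix else "Solaris"
    print(mode,
          "one core flat out:", top_cpu_percent(1.0, 4, irix),
          "all four flat out:", top_cpu_percent(4.0, 4, irix),
          "an 80% reading means cores busy:", cores_busy(80.0, 4, irix))
```

So the same 80% reading is either most of one core or a bit over three cores, which is why the mode matters.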
You're using Kubuntu so you should be able to set up a ksysguard graph if
you like. I don't know if you're on KDE 3.5 or 4.x but 4.x is still
broken for daily use for me (4.2 should fix most of it AFAIK), so I'm
using 3.5.10 still, with a ksysguard kicker applet at the top of my
screen. Its first four graphs are user/system/nice CPU on each of the
four cores, so when I'm in KDE 3 anyway, I get a live updating graph of
activity on each of the four cores. (FWIW, next is load, then memory,
then swap which is normally zero, then up and down network traffic, then
multiplexed disk activity, then the four CPU core-temps, then two
additional system temps. I'm running two 1920x1200 LCDs stacked for
1920x2400, with the ksysguard applet taking up nearly 1500 px width at
the max 300 px kicker panel height, on the top LCD.)
>> Also, it may be disk I/O related, if you have a single disk only and
>> that group's data isn't in cache yet. I run a dual dual-core Opteron
>> 290 (2.8 GHz) here, so have four cores too, but I'm running
>> Gentoo/~amd64 with everything compiled to my specific hardware, which
>> will help some (BTW, you didn't mention whether you were running 32-bit
>> or 64-bit kubuntu, 4 gigs on 32-bit is going to be less efficient than
>> 4 gigs on 64-bit), and I run a 4-disk kernel/md RAID, with pan's data
>> on RAID-6, which means it's two-way striped. RAID striping really
>> /does/ help, and not just with pan; you might be surprised how much.
>
> Yes, I have been considering switching since
>
> 1. my 4 GB is not used (because of memory of graphics card)
FWIW you should be able to configure the 32-bit kernel for 64-gig mode if
you like, or probably download one so configured from Ubuntu (possibly
named 686-bigmem, the Debian name AFAIK). If the BIOS will remap the
memory, you should then get the memory ordinarily covered by the legacy
32-bit PCI I/O hole (typically half a gig or so, sometimes a full gig)
mapped above the 4-gig boundary. This works using PAE mode, AFAIK.
Here's a bit about it. The title says a gig, but it talks about both the
HIGHMEM-4G and HIGHMEM-64G options.
http://www.linux.com/feature/119287
But in 32-bit mode that's less efficient as it has to effectively page
the memory into a window it can actually address. 64-bit of course
eliminates that. And... not all BIOSes support it, 32-bit or 64-bit,
unfortunately, altho most of the newer ones will in 64-bit at least.
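The numbers work out roughly like this. It's just a sketch of the address-space arithmetic, using the half-gig and one-gig hole sizes from above as example inputs; the function names are mine.

```python
GIB = 2**30

def addressable_bytes(address_bits):
    """Size of the physical address space for a given address width."""
    return 2**address_bits

def visible_without_remap(installed, hole):
    """RAM still reachable below the 4 GiB line when the I/O hole sits there."""
    return min(installed, addressable_bytes(32) - hole)

installed = 4 * GIB
for hole in (GIB // 2, GIB):
    usable = visible_without_remap(installed, hole)
    print(f"hole {hole / GIB:.1f} GiB -> {usable / GIB:.1f} GiB usable, "
          f"{(installed - usable) / GIB:.1f} GiB lost unless remapped")

# Plain 32-bit addressing versus PAE's 36-bit physical addresses.
print(addressable_bytes(32) // GIB, "GiB vs", addressable_bytes(36) // GIB, "GiB")
```

The lost half gig or gig only comes back if the BIOS remaps it above 4 GiB and the kernel can address that range, hence the 64-gig/PAE option or a 64-bit kernel.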
> 2. indeed my disk seems to be the bottleneck.
> However I need to completely upgrade my box and that is a hard job.
> Also I have no experience setting up RAID (donno even if my mobo
> supports it)
FWIW, I had no experience with it either, until I had two drives go out
in two years and decided I needed a bit more reliability than that. So I
upgraded to 4xSATA drives and, after some research and planning it all
out, set it up. (My mobo supported firmware RAID mode, but that sucks in
Linux since it's really software RAID anyway and proper kernel RAID is
more reliable, so I set it to straight SATA mode and used the kernel
RAID.)
If you decide to do it and know nothing of RAID, you'll want to google
for the free chapter of O'Reilly's Linux RAID book. It's an excellent
intro, explaining the difference between hardware, firmware and kernel/md
RAID, and the various RAID levels. That's where I started as I knew very
little about it before that.
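As a quick taste of what the RAID levels mean in practice, here's a sketch of how many disks' worth of data capacity (and thus striping) each common level leaves you, and how many drive failures it survives. The function names are just for illustration, and it ignores per-level minimum disk counts and mixed levels like RAID-10.

```python
def data_disks(level, n_disks):
    """Disks' worth of usable capacity for RAID levels 0, 1, 5 and 6."""
    table = {0: n_disks, 1: 1, 5: n_disks - 1, 6: n_disks - 2}
    if level not in table:
        raise ValueError(f"unsupported RAID level: {level}")
    if table[level] < 1:
        raise ValueError("not enough disks for this level")
    return table[level]

def failures_tolerated(level, n_disks):
    """How many drives can die before the array loses data."""
    return {0: 0, 1: n_disks - 1, 5: 1, 6: 2}[level]

for level in (0, 1, 5, 6):
    print(f"RAID-{level} on 4 disks: {data_disks(level, 4)} data disk(s), "
          f"survives {failures_tolerated(level, 4)} failure(s)")
```

With four disks, RAID-6 leaves two data disks per stripe, which is the "two-way striped" setup quoted earlier, while still surviving any two drives going out.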
After that, you'll want to read the kernel's md.txt document (in your
kernel Documentation subdir) and look at the HOWTOs. In particular, keep
in mind that if you're going to boot off the RAID, you need a small
RAID-1/mirrored partition to hold /boot, since RAID-1 is all that either
LILO or GRUB understands. When I set up, there wasn't a lot of info out
there about mdp/partitioned-RAID yet, as it was still pretty new, but I
managed to find what I needed. You will also want to consider LVM2 on
top of RAID, the way people handled it before partitioned RAID. But
while you can boot directly to RAID using an appropriate kernel command
line, LVM2 unfortunately requires an initramfs/initrd. I chose not to
use that, so I put my root filesystem and a backup on partitioned RAID,
and almost everything else I wanted to keep redundant on an LVM on top
of RAID, set up so I could load LVM once my rootfs on the partitioned
RAID was already going.
That's the 10km-high overview at a few hundred km/hr! =:^) It sounds
confusing condensed like that, but take it a step at a time as I did, and
you should be fine, as I was/am. =:^) If you get stuck, you know someone
to mail for help. =:^)
Not to pressure you if you don't believe you're ready, but really, if
you're already running quad-core and 4 gig RAM, a single spindle hard
drive IS the bottleneck, and you'll find the system not only faster, but
much more responsive, once you effectively get that millstone off your
neck. I just think it's such a shame to have a nice system bogged down
like that.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman