pan-devel

Re: [Pan-devel] Scoring articles by ration of bytes/lines


From: Charles Kerr
Subject: Re: [Pan-devel] Scoring articles by ration of bytes/lines
Date: Wed, 31 Jan 2007 19:17:33 -0600
User-agent: Thunderbird 1.5.0.9 (X11/20061219)

Konrad Karl wrote:
Hi,

I want to be able to score/filter/delete articles where
the ratio of article_bytes / article_lines is below a certain
value.

Many sporged postings could be easily identified this way. With
the old Pan 1.x I used a simple Perl filter program
in order to delete articles with too low a ratio, and this simple
approach worked surprisingly well. The algorithm might require some
tweaking, e.g. if the number of lines is < 10, then don't apply the
ratio rule, etc.
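
The heuristic described above could be sketched roughly as follows. This is
only an illustration in Python, not the original Perl filter (which is not
shown in this thread) and not Pan code; the function name
is_probable_sporge and the default threshold min_ratio=20.0 are assumptions
chosen for the example.

```python
def is_probable_sporge(article_bytes: int, article_lines: int,
                       min_ratio: float = 20.0, min_lines: int = 10) -> bool:
    """Flag an article whose bytes-per-line ratio is suspiciously low.

    min_ratio and min_lines are hypothetical defaults, not values
    taken from the thread (apart from the "fewer than 10 lines" idea).
    """
    # Short articles are exempt from the ratio rule; max(..., 1) also
    # guards against dividing by zero when the line count is 0.
    if article_lines < max(min_lines, 1):
        return False
    return article_bytes / article_lines < min_ratio


if __name__ == "__main__":
    # (bytes, lines) pairs made up purely for demonstration.
    for size, lines in [(150, 30), (400, 5), (3000, 40)]:
        print(size, lines, is_probable_sporge(size, lines))
```

The threshold would need tuning against real postings, since the thread
does not say which ratio worked in practice.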

Now I have started looking into the latest sources, but I am
afraid it will take considerable time until I understand
what's going on.

What do you think?

Greetings,
Konrad

Hi Konrad,

This can be done in 0.120 by adding a scoring rule to ignore
all articles with a line count less than 10.
See Article > Add a Scoring Rule

cheers,
Charles



