Re: [Pan-users] Re: Search messages


From: Alan Meyer
Subject: Re: [Pan-users] Re: Search messages
Date: Mon, 27 Apr 2009 20:30:38 -0700 (PDT)

Duncan <address@hidden> wrote:
...
> For search, however, there's a workaround, provided the message
> is still in the (10 MB by default) cache, reasonably likely for
> text-only users but not so much for those doing binaries where
> the cache is tiny.  Just do a filesystem (not pan) search of
> the cache (a subdir of ~/.pan2 by default, named article-cache
> or article_cache I'm not sure which as it changed at some time
> in the past and I ended up symlinking one to the other, here).
>
> Filesearching the cache, you'll come up with a file or list of
> files with names matching (as closely as easily possible on a
> filesystem, a few strange message-id characters may be replaced
> by more commonly allowed chars) the message-ids of the messages
> in question.  You can then open those files, basically the raw
> text format of the messages in question, in a normal text
> editor, or use pan's message-id search as mentioned above to
> find them in pan.
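
For anyone who would rather script the filesystem search described
above than run grep by hand, here is a minimal sketch in Python.
The function name search_cache is made up for illustration, and the
cache directory is an assumption (article-cache or article_cache
under ~/.pan2, per the note above), so pass in whichever one exists
on your system.  It does a plain case-insensitive substring match on
the raw file bytes and makes no attempt to decode encoded bodies.

```python
import os


def search_cache(cache_dir, needle):
    """Return sorted names of files in cache_dir whose raw contents
    contain needle (case-insensitive substring match)."""
    wanted = needle.lower().encode("utf-8")
    hits = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        # Skip subdirectories and anything else that is not a plain file
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fp:
            if wanted in fp.read().lower():
                hits.append(name)
    return hits
```

Calling search_cache(os.path.expanduser("~/.pan2/article-cache"),
"some phrase") returns the matching file names, which should
resemble the message-ids of the cached articles.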

Here's a little utility program that makes searching easier.  I use
it to reduce a large article list to the bare essentials I want for
searching.  Each line of output has the form:

    Date Time Subject (num parts)

Then I search the output with less or grep.

The article list files it reads are in ~/.pan2/groups on my system.
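
If you'd rather stay in Python than pipe through grep, something
along these lines would do the filtering.  It is only a sketch:
filter_lines and its parameters are hypothetical names, and it
assumes each line looks like "YYYYMMDD HHMMSS Subject", which is
what the script below writes.

```python
import re


def filter_lines(lines, pattern, date_prefix=""):
    """Yield lines whose subject matches pattern (case-insensitive
    regex) and whose date field starts with date_prefix."""
    rx = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        line = line.rstrip("\n")
        # Split off the date and time fields; the rest is the subject
        parts = line.split(" ", 2)
        if len(parts) < 3:
            continue  # not a "date time subject" line
        date, _time, subject = parts
        if date.startswith(date_prefix) and rx.search(subject):
            yield line
```

For example, filter_lines(open("whatever.txt"), "search", "200904")
would keep only April 2009 articles with "search" in the subject.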

    Alan

----------------------  cut here  ------------------------
#!/usr/bin/python

########################################################
# uniqpan.py
#
# Filter a Pan newsgroup articles file to find article subjects and dates.
#
#               Author: Alan Meyer
#              License: Free under the GNU GPL.
########################################################
import sys, time, re

if len(sys.argv) != 2:
    sys.stderr.write("""
usage: uniqpan.py article_filename
  e.g.,
     cd ~/.pan2/groups
     uniqpan.py article_filename | sort > whatever.txt
""")
    sys.exit(1)

idPat = re.compile("^<.*>")

try:
    fp = open(sys.argv[1], "r")
except IOError as info:
    # Report the failure on its own line and exit with an error status
    sys.stderr.write("%s: %s\n" % (sys.argv[1], str(info)))
    sys.exit(1)

while True:
    line = fp.readline()
    if not line:
        break

    idMatch = idPat.match(line)
    if idMatch is not None:
        # Found an article message-id
        # Get succeeding lines
        title = fp.readline().strip()
        authorCode = fp.readline()
        timePosted = fp.readline()
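        # Six-field fallback so the formatting below still works
        # when no timestamp can be parsed
        dateTm = (0, 0, 0, 0, 0, 0)
        try:
            # Next line may be time or may be references
            dateTm = time.gmtime(float(timePosted))
        except (ValueError, OverflowError):
            # Try the line after
            try:
                timePosted = fp.readline()
                dateTm = time.gmtime(float(timePosted))
            except (ValueError, OverflowError):
                # Give up and keep the all-zero fallback
                pass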

        # Time as YYYYMMDD HHMMSS (UTC)
        dateTime = "%04d%02d%02d %02d%02d%02d" % \
          (dateTm[0], dateTm[1], dateTm[2], dateTm[3], dateTm[4], dateTm[5])

        # Output what we've got
        sys.stdout.write("%s %s\n" % (dateTime, title))


      



