From: Ralph Corderoy
Subject: Re: [Nmh-workers] Request for sortm feature to implement arbitrary message ordering
Date: Thu, 26 Jun 2014 12:03:27 +0100

Hi Norm,

> This request is for sortm to have a '-program path-name' feature
> path-name would name a command which would be invoked by sortm. It
> would be given 2 arguments, full path names of messages. Its standard
> output would either
>     begin with "+", meaning that the first argument is to be regarded
>     as greater than the second,
>     or
>     begin with "-", meaning that the first argument is to be regarded
>     as less than the second
>     or
>     be empty  or begin with some other character, meaning that sortm
>     would use its usual mechanisms to relate the two arguments.

What if they are equal?  How about the result characters for a
comparison of foo and bar being `<', `=', and `>'?  Should the sort be
defined as stable given equal comparisons?  I think that would be
useful, else each run could annoyingly permute equal emails into a new
order.
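
To make the stability point concrete, here is a minimal sketch in Python
rather than sortm's C.  The names VERDICT, sort_messages and
by_first_letter are made up for illustration, and the comparator is a
stand-in function rather than an external program.  It relies on
Python's sorted() being documented as stable.

```python
from functools import cmp_to_key

# Hypothetical mapping from the proposed result characters to cmp values.
VERDICT = {"<": -1, "=": 0, ">": 1}


def sort_messages(paths, compare):
    """Sort paths with a three-way comparator returning '<', '=' or '>'.

    sorted() is stable, so messages that compare '=' keep their
    existing relative order from one run to the next.
    """
    return sorted(paths, key=cmp_to_key(lambda a, b: VERDICT[compare(a, b)]))


def by_first_letter(a, b):
    """Stand-in comparator: only the first letter counts, so ties occur."""
    if a[0] == b[0]:
        return "="
    return "<" if a[0] < b[0] else ">"


if __name__ == "__main__":
    print(sort_messages(["b2", "a1", "b1"], by_first_letter))
```

Here "b2" and "b1" compare equal, so "b2" stays ahead of "b1" however
often the sort is repeated.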

> A non-zero exit status would be fatal to sortm.

I see Ken's point about using exit status, but I think it's too easy for
a script to `exit 1' without meaning to give a comparison result, e.g.
`set -e' is in place and grep, unexpectedly, doesn't find any matches.
So I think stdout is probably the better channel.  I'd like to see it
be precisely defined as two bytes then EOF, the second being `\n'.
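
A strict reading of that rule might look like the sketch below, in
Python for brevity.  parse_verdict is a hypothetical name, and the set
of accepted first bytes assumes the `<', `=', `>' characters suggested
above.  The point is that the empty output of a script that died early
is rejected rather than taken as a verdict.

```python
def parse_verdict(output: bytes) -> str:
    """Accept exactly b'<\\n', b'=\\n' or b'>\\n'; reject anything else."""
    if (
        len(output) != 2
        or output[1:] != b"\n"
        or output[:1] not in (b"<", b"=", b">")
    ):
        raise ValueError(f"malformed comparison result: {output!r}")
    return output[:1].decode("ascii")
```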

> sortm would assume, without checking, that the ordering imposed by the
> command was transitive and anti-symmetric. That is, that a<b and b<c
> implies a<c and that a<b implies b>a.

By implication, the comparison program is buggy if that doesn't hold?
sortm(1) punts to qsort(3) for the hard graft and that demands
consistency;  I think I'd like sortm to protect me from a buggy
comparison program.
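
One brute-force way to picture that protection is sketched below in
Python.  find_inconsistency and FLIP are hypothetical names, and the
comparator is assumed to return `<', `=' or `>'.  It costs O(n^3)
comparisons, so it only illustrates what "consistent" means;  it is not
a suggestion for how sortm should check a large folder.

```python
from itertools import combinations, permutations

# What the reversed comparison must return for each verdict.
FLIP = {"<": ">", "=": "=", ">": "<"}


def find_inconsistency(items, compare):
    """Return a tuple describing the first violation found, or None.

    Checks anti-symmetry on every pair, then transitivity of '<' on
    every ordered triple.
    """
    for a, b in combinations(items, 2):
        if compare(b, a) != FLIP[compare(a, b)]:
            return ("asymmetric", a, b)
    for a, b, c in permutations(items, 3):
        if compare(a, b) == "<" and compare(b, c) == "<" and compare(a, c) != "<":
            return ("intransitive", a, b, c)
    return None
```

A rock-paper-scissors comparator passes the pairwise check but fails
the triple check, which is exactly the kind of bug qsort(3) cannot
tolerate.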

kre has brought up the issue of fork/exec overhead, and I was wondering
whether a qsort-like cmp program is the right interface.  What about
taking a leaf out of Python's sort and providing a program to generate a
`key' line for each argument?  So, to sort by size of the mail file, a
wrapper for stat(1) could be used.

    $ stat -c '%010s' /etc/passwd /etc/group

The program could be invoked several times, at most once per mail
message, to gather all the keys.  They'd be compared by strcoll(3) by
default, not strcmp(3).  There must be one, possibly empty, line per
argument, and it must exit(0).
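
Roughly, the key-gathering side could look like this Python sketch.
pair_keys, gather_keys and sort_by_key are hypothetical names, and for
simplicity it runs the program once over all the paths instead of in
batches.  locale.strxfrm() is used so that comparing the transformed
keys matches strcoll(3) in the current locale.

```python
import locale
import subprocess


def pair_keys(paths, output):
    """Match each path with its key line; demand exactly one line per path."""
    keys = output.split("\n")
    if keys and keys[-1] == "":
        keys.pop()  # drop the empty piece after the final newline
    if len(keys) != len(paths):
        raise ValueError(f"expected {len(paths)} key lines, got {len(keys)}")
    return dict(zip(paths, keys))


def gather_keys(program, paths):
    """Run the key program over all paths; a non-zero exit raises."""
    done = subprocess.run(
        [program, *paths], capture_output=True, text=True, check=True
    )
    return pair_keys(paths, done.stdout)


def sort_by_key(paths, keys):
    """Stable sort of paths, ordering keys as strcoll(3) would."""
    return sorted(paths, key=lambda p: locale.strxfrm(keys[p]))
```

With the zero-padded sizes that the stat(1) example prints, plain
string collation already gives numeric order.  An empty key line is
accepted, a missing one is an error.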

(I see subsort() and txtsort() use strcmp(3) rather than strcoll(3).
http://git.savannah.gnu.org/cgit/nmh.git/tree/uip/sortm.c#n495 Is this
still desired?)

> If there were several  '-program path-name' pairs the last one would
> be used.  -noprogram would cancel any previous '-program path-name'
> pair.

I know this is in the MH tradition, but I find it a bit restrictive.

Taking sort(1)'s multiple -k options as inspiration, what about allowing
the sortm options multiple times, as kre suggests, including the new
key-program one?

    sortm -kp ./msgsize -kh x-mailer -kd date -keyd resent-date

I've always found sortm's logic over -textfield and -datefield very
contorted and not useful, e.g.

    With -textfield field, if -limit days is specified, messages with
    similar textfields that are dated within `days' of each other appear
    together.  Specifying -nolimit makes the limit infinity.  With
    -limit 0, the sort is instead made textfield-major, date-minor.

New -kheader, -kdate, -kprogram options could respectively strcoll()
decoded headers, compare dates, and kick off an external key-generating
program.  A character not acceptable in header names (colon?) could be
used to suffix flags, e.g. `-kh x-priority:nr' to have X-Priority's `42'
be sorted numerically in reverse order.  This allows -kdate to be
dropped;  it's -kheader with a `d' flag for date.  These flags apply to
-kp too, so stat(1) need not pad with 0s if `:n' is given.
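
As a sketch of how such flagged keys might combine, here is some Python
that treats each message as a dict of lower-cased header names to
values.  parse_key_spec and multi_key_sort are hypothetical names, only
the `n' and `r' flags are handled, and non-numeric keys are compared as
plain strings rather than with strcoll().  It sorts on the least
significant key first and leans on the sort being stable, including
when reversed.

```python
def parse_key_spec(spec):
    """Split 'x-priority:nr' into ('x-priority', {'n', 'r'})."""
    name, _, flags = spec.partition(":")
    return name.lower(), set(flags)


def multi_key_sort(messages, specs):
    """Sort header dicts by several keys, the first spec most significant."""
    ordered = list(messages)
    for spec in reversed(specs):
        name, flags = parse_key_spec(spec)

        def key(msg, name=name, numeric="n" in flags):
            value = msg.get(name, "")
            # Assumption: a missing numeric header sorts as zero.
            return float(value or 0) if numeric else value

        ordered.sort(key=key, reverse="r" in flags)
    return ordered
```

So multi_key_sort(msgs, ["x-priority:nr", "x-mailer"]) puts the highest
X-Priority first and breaks ties by X-Mailer.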

The two new -k[hp] options would be mutually exclusive with the old
-{text,date}field;  both could not be given.  This allows for a clean
break with the old.

Does that cover your needs?  As kre said, this can all be done with the
anno+sortm+anno dance I showed in
but it would be nice to give nmh native modern sorting.

Cheers, Ralph.