Re: GNU Parallel Bug Reports A suggestion: --shuf and -k
From: paralleluser
Subject: Re: GNU Parallel Bug Reports A suggestion: --shuf and -k
Date: Fri, 30 Jun 2017 17:58:31 -0400

True, if the input names are easily sortable. They are in the example I
proposed, but in my real-life case they are not. With your sort idea, you
could add "--tag" and then sort the output.
I use -k so that I can paste the data, as an ordered vector, back into Excel
or Matlab/Octave/R. So yes, the jobs can run in arbitrary order, but to keep
the vector indices in the same order as the input, you need either -k or a
sort (if your data allows that, or if you can force it to). The only sort
order that matters is whatever order your predefined matrix has. If you can
force that to be ASCII order, then yes, sort works. If you cannot, or doing so
would be a pain, then -k has value, I think.
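
When the inputs do not sort naturally, the sort idea can still be made to work by numbering the lines first. The sketch below shows only that ordering trick, using nl, shuf, sort and cut from GNU coreutils. It is a workaround, not how GNU Parallel behaves: process_one is a hypothetical stand-in for the real job, the while loop runs jobs one at a time and would be replaced by the parallel runner, and each job is assumed to print exactly one line.

```shell
#!/bin/sh
# Hypothetical stand-in for the real per-line job (assumption, not from the thread).
process_one() {
    printf 'result-for-%s\n' "$1"
}

# Read input lines on stdin, process them in random order,
# and print the results in the original input order.
run_shuffled_keep_order() {
    tab=$(printf '\t')
    # 1. Prefix every line with its line number and a tab.
    # 2. Shuffle, so lines for the same server are not handled back to back.
    # 3. Run the job, carrying the line number along with its result.
    # 4. Sort numerically on the line number, then strip it off again.
    nl -ba -w1 -s "$tab" |
        shuf |
        while IFS="$tab" read -r idx item; do
            printf '%s\t%s\n' "$idx" "$(process_one "$item")"
        done |
        sort -n -k1,1 |
        cut -f2-
}
```

For example, with a hypothetical input file inputs.txt: run_shuffled_keep_order < inputs.txt > results.txt. The results come back as an ordered vector matching the input lines, whatever order the jobs actually ran in.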
On Fri, Jun 30, 2017, at 05:46 PM, Rob Sargent wrote:
> -1
> (If jobs can be started independent of order, so too is the analysis of the
> output. From your description, the problem is solved with a call to sort.)
> > On Jun 30, 2017, at 3:11 PM, paralleluser <address@hidden> wrote:
> >
> > Friends
> >
> > A suggestion that merits your comments and review:
> >
> > --shuf does exactly what the man page says it does, but when you combine
> > --shuf and -k, the -k does nothing; --shuf overrides -k.
> >
> > I propose that when --shuf and -k are combined, this happens:
> >
> > the jobs are still processed in random order,
> > but the output is in the same order as the original input
> >
> > When do you use this? Assume your input to parallel process is:
> >
> > server1/resource1
> > server1/resource2
> > server1/resource3
> > ...etc...
> > server2/resource1
> > server2/resource2
> > ...etc...
> > ...up to server50
> >
> > For human processing reasons, it is easier to keep all the server/resource
> > input lines in ASCII sort order
> >
> > For computer processing reasons, server 1 is going to hate you if you are
> > hitting it with a lot of requests all at the same time
> >
> > Thus with the "--shuf -k" combo, the server loads will be spread around, but
> > you will get your data back in the same order.
> >
> > Comments welcome........thanks
> >
>