Re: datamash performance question

From: Erik Auerswald
Subject: Re: datamash performance question
Date: Sat, 26 Jun 2021 19:36:06 +0200
User-agent: Mutt/1.5.21 (2010-09-15)


On Fri, Jun 25, 2021 at 05:36:26PM -0400, Jake VanEck wrote:
> Any way to run datamash in parallel?

You can try GNU parallel (https://www.gnu.org/software/parallel/) or
xargs --max-procs to start several GNU datamash processes.

GNU parallel should give you more control over how to provide input to
the GNU datamash processes and might be a better fit than xargs.

I can only provide those pointers, because I have not used GNU parallel
myself. I expect the complicated part to be dividing one input stream
into several streams for independent processes, and then combining their
outputs to produce the end result.
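
To make the split-and-combine step concrete, here is a minimal Python
sketch. It is an illustration under stated assumptions, not a tested recipe
for GNU parallel or xargs: the helper names (split_round_robin, run_chunk,
parallel_sum) are hypothetical, the input is assumed to be one number per
line, and the per-chunk command is assumed to be "datamash sum 1". It also
assumes the operation can be combined from partial results, which holds for
a sum but not, for example, for a median.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumed per-chunk command: sum the first field of every input line.
DATAMASH_SUM = ["datamash", "sum", "1"]


def split_round_robin(lines, n_chunks):
    """Deal the input lines into n_chunks lists, one line at a time."""
    chunks = [[] for _ in range(n_chunks)]
    for i, line in enumerate(lines):
        chunks[i % n_chunks].append(line)
    # Drop empty chunks so that no process is started without input.
    return [chunk for chunk in chunks if chunk]


def run_chunk(cmd, chunk):
    """Feed one chunk to a child process and parse its single-number output."""
    result = subprocess.run(
        cmd,
        input="".join(line + "\n" for line in chunk),
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


def parallel_sum(lines, n_chunks=4, cmd=DATAMASH_SUM):
    """Run cmd on each chunk concurrently and add up the partial sums."""
    chunks = split_round_robin(lines, n_chunks)
    if not chunks:
        return 0.0
    # Threads suffice here: the actual work happens in the child processes.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(lambda chunk: run_chunk(cmd, chunk), chunks))
    # Combining step: a sum of partial sums equals the sum of the whole input.
    return sum(partials)
```

With GNU datamash installed, parallel_sum(["1", "2", "3", "4"], n_chunks=2)
would start two datamash processes and add their results. For operations
that are not decomposable this way, the combining step needs more thought
than a final sum.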

Do things that have never been done before.
                        -- Russell Kirsch
