bug-datamash
Re: datamash performance question


From: Jake VanEck
Subject: Re: datamash performance question
Date: Fri, 25 Jun 2021 16:41:02 -0400

I've tried similar commands, but doesn't awk need to hold the entire dataset in memory for this? If so, the data is far larger than memory can handle, but I'll test it out anyway.

Thanks for the suggestion.

-Jake

On Fri, Jun 25, 2021, 4:38 PM Dima Kogan <dima@secretsauce.net> wrote:
I'd be curious to see ways to make datamash work faster here, but if all
you're doing is computing sums, you can use awk instead of datamash:

  mawk '{s[$2] += $1;} END { for (k in s) {print k, s[k]; } }'

Then you can probably also get rid of your sort and sed calls.
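
A minimal sketch of the one-liner above on a tiny sample, assuming (as the awk program implies) that the value is in column 1 and the key in column 2. The sample data and the `sum_by_key` wrapper are hypothetical, and plain `awk` stands in for `mawk` for portability. awk reads its input one line at a time, and the array `s` holds one running sum per distinct key, so memory use should grow with the number of distinct keys rather than with the number of input lines.

```shell
# Hypothetical helper wrapping the one-liner from the thread.
# s[$2] accumulates column 1 per key; only one entry per distinct key is kept.
sum_by_key() {
  awk '{ s[$2] += $1 } END { for (k in s) print k, s[k] }'
}

# Made-up sample input: "value key" per line.
# The order of "for (k in s)" is unspecified, so sort the output for stable results.
printf '1 a\n2 b\n3 a\n4 b\n' | sum_by_key | sort
# prints:
# a 4
# b 6
```

If the number of distinct keys is itself too large for memory, this approach would not help, and a sort-based pipeline would still be needed.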
