bug-datamash

From: Tomas Peitl
Subject: Re: Suggestion: add the possibility to apply multiple operations to a single column (or multiple columns)
Date: Mon, 7 Nov 2022 09:30:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.2.2

Hi Tim,

Thanks for the reply.

> The main thing to be careful of with a "mean,max,count"-style operation is how it would interact with groupby or crosstab. E.g., I wonder if "datamash groupby 1 mean,max,count 2" makes sense in any way.

> Ranges like 1-2,4 could be less straightforward, especially when combined with the former idea of providing multiple operations simultaneously. When preparing a test for "mean,max,count 1-2,4", should the test output columns like "mean_1, max_1, count_1, mean_2, max_2, count_2, mean_4, max_4, count_4", or "mean_1, mean_2, mean_4, max_1, max_2, max_4, count_1, count_2, count_4", or something else?

Good points, I didn't even realize you would have to make a decision here. Perhaps it's more natural to think of 'mean,max' as a single combined operation, i.e. the ordering mean 1 max 1 mean 2 max 2, but I also originally had the other ordering in mind. But in any case, if this is what you need and you type it out verbosely, you still have to make that same decision, only here we would have to make a default decision for every case. There could even be a command-line switch to toggle the ordering.
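
To make the two candidate orderings concrete, here is a minimal Python sketch. It is not datamash code and says nothing about how datamash would implement this; `parse_columns`, `expand`, and the `column_major` flag are hypothetical names made up for illustration, and the `op_col` labels simply follow the naming used in the question above.

```python
def parse_columns(spec):
    """Expand a column spec such as "1-2,4" into [1, 2, 4] (hypothetical helper)."""
    columns = []
    for part in spec.split(","):
        if "-" in part:
            # An inclusive range like "1-2"
            start, end = part.split("-")
            columns.extend(range(int(start), int(end) + 1))
        else:
            columns.append(int(part))
    return columns


def expand(ops_spec, cols_spec, column_major=True):
    """Pair every operation with every column and return labels like "mean_1".

    column_major=True  -> all operations for column 1, then column 2, ...
    column_major=False -> one operation over all columns, then the next operation.
    """
    ops = ops_spec.split(",")
    cols = parse_columns(cols_spec)
    if column_major:
        pairs = [(op, col) for col in cols for op in ops]
    else:
        pairs = [(op, col) for op in ops for col in cols]
    return [f"{op}_{col}" for op, col in pairs]


# Treating 'mean,max' as one combined operation applied per column:
print(expand("mean,max", "1-2"))
# The other ordering, with each operation swept across the columns:
print(expand("mean,max", "1-2", column_major=False))
```

A switch like `column_major` is the kind of thing the command-line toggle mentioned above would control; which value should be the default is exactly the open question.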

> Is there any chance you could provide a preliminary patch and tests which would get the ball rolling? You could break it up into two patches, one for adding column ranges, and one for "lambda-ing" multiple operations over a column.

Perhaps, but can't make any promises either.

Cheers,
Tomas



