Numerical field labels and --header-in


From: Dima Kogan
Subject: Numerical field labels and --header-in
Date: Thu, 19 May 2022 20:31:49 -0700
User-agent: mu4e 1.6.10; emacs 29.0.50

I'm patching stuff and writing emails about things I find while adding
vnlog support. Here's another.

As we know, 'datamash --header-in' will read header names from the first
record, and will accept these names in references. As I just found out
(and as I'm guessing most people reading this don't know), these named
references are optional, and the numerical field indices still work. Not
only that, the numerical field indices have precedence. So if you have
this data:

  0    1   2
  1.1 2.2 3.3

Then 'datamash --header-in sum 1' returns 1.1 and NOT 2.2. This sucks.
If header names are available, those should be the only way to reference
fields.
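
To make the precedence issue concrete, here is a minimal Python sketch of
the two lookup policies applied to the data above. It is an illustrative
model only, not datamash's actual code: the function resolve_field and its
names_only parameter are hypothetical names invented for this example.

```python
def resolve_field(ref, header, names_only=False):
    """Return the 0-based column index that `ref` selects.

    header: list of field names read from the first record.
    names_only=False models the behaviour described above: an all-digit
    reference is taken as a 1-based position, even if a field has that name.
    names_only=True models the proposed behaviour: when a header is
    present, references are matched against field names only.
    (Hypothetical helper for illustration; not part of datamash.)
    """
    if not names_only and ref.isdigit():
        index = int(ref) - 1
        if not 0 <= index < len(header):
            raise ValueError(f"invalid field index: {ref}")
        return index
    if ref not in header:
        raise ValueError(f"no such field name: {ref}")
    return header.index(ref)


header = ["0", "1", "2"]       # first record: the field labels
row = ["1.1", "2.2", "3.3"]    # second record: the data

current = row[resolve_field("1", header)]                    # positional lookup wins
proposed = row[resolve_field("1", header, names_only=True)]  # name lookup only
print(current, proposed)
```

Under the positional-first policy the reference "1" selects the first
column (1.1); under the names-only policy it selects the column labelled
"1" (2.2).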

If somebody's thinking that the above example is an error-prone way to
label fields, then I don't disagree, but people do it. I've actually
seen vnlog users do this more than once. And there are more legitimate
use cases where you could have an integer field label, anyway.

There's a fix in my tree:

  
https://github.com/dkogan/datamash/commit/76080a51f2dda27734d32fbb6aae5b85f1530c5b

This isn't complete because it doesn't touch the tests, and there are
currently a lot of them that assume the current behavior. What do we want
to do?

For vnlog support, the logic in that patch is a requirement, but the
patch can be adjusted to apply to vnlog only, and that won't break
anybody's existing usage.
