Numerical field labels and --header-in

From: Dima Kogan
Subject: Numerical field labels and --header-in
Date: Thu, 19 May 2022 20:31:49 -0700
I'm patching stuff and writing emails about things I find while adding
vnlog support. Here's another.

As we know, 'datamash --header-in' will read header names from the first
record, and will accept these names in references. As I just found out
(and as I'm guessing most people reading this don't know), these named
references are optional, and the numerical field indices still work. Not
only that, the numerical field indices have precedence. So if you have
this data:

  0    1   2
  1.1 2.2 3.3

Then 'datamash --header-in sum 1' returns 1.1 and NOT 2.2. This sucks.
If header names are available, those should be the only way to reference
fields.
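
To make the two lookup orders concrete, here's a small Python model of
them. This is only a sketch of the behavior described above, not
datamash's actual code; the function names resolve_numeric_first and
resolve_name_only are made up for illustration.

```python
def resolve_numeric_first(ref, header):
    """Model of the behavior described above: an all-digit reference is
    taken as a 1-based column index before the header is consulted."""
    if ref.isdigit():
        return int(ref) - 1
    return header.index(ref)


def resolve_name_only(ref, header):
    """Model of the proposed behavior: when a header is present, a
    reference is always looked up as a field name."""
    return header.index(ref)


# The data from the example: header record, then one data record
header = "0 1 2".split()
row = [float(x) for x in "1.1 2.2 3.3".split()]

print(row[resolve_numeric_first("1", header)])  # 1.1: first column, by index
print(row[resolve_name_only("1", header)])      # 2.2: the column labeled "1"
```

With numeric-first lookup, the reference "1" never reaches the header at
all, so the column labeled "1" can't be selected by that name.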

If somebody's thinking that the above example is an error-prone way to
label fields, then I don't disagree, but people do it. I've actually
seen vnlog users do this more than once. And there are more legitimate
use cases where you could have an integer field label, anyway.

There's a fix in my tree:


This isn't complete because it doesn't touch the tests, and there are
currently a lot of them that assume current behavior. What do we want to
do?

For vnlog support, the logic in that patch is a requirement, but the
patch can be adjusted to apply to vnlog only, and that won't break
anybody's existing usage.

