[Bug-datamash] Re; New feature: extended identifiers


From: Assaf Gordon
Subject: [Bug-datamash] Re; New feature: extended identifiers
Date: Wed, 5 Oct 2016 10:46:28 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0

Hello Stefan,

Sorry for the delayed response.

Regarding your suggestion in:
http://lists.gnu.org/archive/html/bug-datamash/2016-09/msg00006.html

> Now comes in my issue with 'datamash': my dynamically generated
> tables specify column names by the pattern '[a-z0-9-]+', as already
> suggested above. Currently, passing in 'total-node-mem' as a field to
> 'datamash', makes the program to exit with error (saying that field
> ranges must be numeric).

I like this idea, thanks for the patch!

I think this could be made the default, without the need for an
additional option. We just need to make it well defined (since quotes
are also used for shell quoting).

Then we document it, ensure it doesn't cause any regressions with the
current tests, and add new tests for this new behavior.

IIUC, you want quotes (single and double) to allow minus characters
(and in fact, any other character?) in a token, which will later be
used as a field name.

Currently, datamash rejects field names with dashes:

   $ datamash -H sum total-node-mem
   datamash: field range for ‘sum’ must be numeric

However, simple quotes won't help, as the shell removes them before
datamash sees the arguments:

   $ datamash -H sum "total-node-mem"
   datamash: field range for ‘sum’ must be numeric

   $ datamash -H sum 'total-node-mem'
   datamash: field range for ‘sum’ must be numeric

Which means it will either require awkward double quoting:

   $ datamash -H sum "'total-node-mem'"

Or slightly less awkward, with the entire 'program' as a string:

   $ datamash -H "sum 'total-node-mem'"

Neither of these is intuitive (definitely not for less savvy Unix users).
It will be tricky to explain this in the documentation: saying
"use quotes for fields with minus characters" is incorrect and
insufficient, so we'd need to go into explaining shell quoting.
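
To make the quoting forms above concrete, here is a small sketch that
prints what a program actually receives in each case. 'show_args' is a
hypothetical stand-in for datamash (it is not part of datamash); it only
echoes its arguments, one per line, in brackets.

```shell
# show_args is a hypothetical stand-in for datamash: it prints each
# argument it receives in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Plain quotes are removed by the shell, so these three calls all pass
# the same two arguments: sum and total-node-mem.
show_args sum total-node-mem
show_args sum "total-node-mem"
show_args sum 'total-node-mem'

# Nested quotes: the inner single quotes survive and reach the program.
show_args sum "'total-node-mem'"

# Whole 'program' as one string: a single argument that still contains
# the space and the inner quotes.
show_args "sum 'total-node-mem'"
```

The first three calls are indistinguishable from the program's side,
which is why plain quoting alone cannot mark a token as a field name.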

Perhaps we should consider another escaping scheme, one that's easy to
type and does not conflict with other possible shell characters?
Something like this (just a thought, not necessarily the optimal solution):

   datamash -H sum {total-node-mem}
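
As a rough sketch of how such a token would behave: in bash, braces
that contain neither a comma nor a ".." range are not expanded, so the
token should reach the program unchanged (a name containing a comma
would still be expanded). 'strip_braces' below is a hypothetical helper,
not existing datamash code; it only illustrates turning a {name} token
back into a field name.

```shell
# Braces without a comma or ".." range are left alone by the shell,
# so this prints the token with its braces intact.
printf '%s\n' {total-node-mem}

# strip_braces is a hypothetical helper: if the token is wrapped in
# braces, print the text between them; otherwise print it unchanged.
strip_braces() {
    case $1 in
        '{'*'}')
            token=${1#?}      # drop the leading "{"
            token=${token%?}  # drop the trailing "}"
            printf '%s\n' "$token"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

strip_braces {total-node-mem}
strip_braces 3
```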



regards,
 - assaf



