bug-datamash



From: Dima Kogan
Subject: Re: [PATCH] Fixed incomplete and incorrect treatment of comments and trailing whitespace
Date: Fri, 27 May 2022 20:58:26 -0700
User-agent: mu4e 1.6.10; emacs 29.0.50

Tim Rice <trice@posteo.net> writes:

> A more correct analogy is to consider how Awk handles comments *in the
> data consumed by Awk*, not within Awk scripts.

AWK doesn't have a concept of comments in the data it parses.


> Still, even though we can't roll it back, at least we shan't compound
> the misstep by making comments even more complicated.

I disagree, but I also don't particularly care what datamash does by
default. In --vnlog mode it should suppress trailing comments.


> That is,
>
> bar 5
> bbb   
>
> The second line has trailing spaces. At the moment, Datamash handles this
> in a way that is arguably correct:
>
> ```
> $ ./datamash -W transpose < ~/tmp/testing.txt
> bar     bbb
> 5
> ```

The patch adds a test to the test suite to flag the faulty behavior, and
to demonstrate that it has been fixed. See that for detail.


> Furthermore, if trailing whitespaces are a problem for you, they can
> easily be removed by sed. I'm not convinced that datamash should need
> to handle all aspects of cleaning up messy data.

So, the whole point of datamash and vnlog and all the others is to be
convenient. I can do everything with AWK, or with C code, and nobody
NEEDS any of these tools. Handling obviously-quirky input in the obvious
way improves the convenience of datamash, and we should strive to do
that. But again, I don't care what datamash does without --vnlog. With
--vnlog, it should do what my patches do.
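
For illustration only, here's a rough Python sketch of the kind of cleanup
being argued about: dropping a trailing comment and trailing whitespace
from each line before the fields get parsed. This is not the patch and not
datamash code. The names clean_line and clean_lines are made up, and it
assumes "#" starts a comment anywhere on a line, with no special handling
for a vnlog legend line.

```python
def clean_line(line, comment_char="#"):
    """Drop a trailing comment and trailing whitespace from one line.

    Assumption: comment_char starts a comment wherever it appears.
    """
    # Keep only the text before the first comment character (if any).
    data, _, _ = line.partition(comment_char)
    # rstrip() removes trailing spaces, tabs, and the newline.
    return data.rstrip()


def clean_lines(lines):
    """Yield cleaned lines, skipping lines that end up empty."""
    for line in lines:
        cleaned = clean_line(line)
        if cleaned:
            yield cleaned
```

With something like this in front of the parser, the "bbb" line with
trailing spaces and a plain "bbb" line come out identical, which is the
convenience I'm talking about.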

Once you decide what you want the base code to be, I'll rebase my
patches on top, to do the extra stuff with --vnlog. Kinda busy right
now. Probably will get to it in a week or two.


