bug-coreutils

bug#16468: join


From: Bernhard Voelker
Subject: bug#16468: join
Date: Fri, 17 Jan 2014 01:00:10 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 01/16/2014 07:10 PM, Eric Blake wrote:
> On 01/16/2014 10:46 AM, barry kesner wrote:
>>   How do you tell join this without re-sorting?  The files are huge!
> 
> Unfortunately, there isn't any really good way, short of re-processing
> the files to make the data appear sorted in the order join expects.
> That said, it certainly appears that for your given data, you can write
> a sed filter that can reprocess on a line-by-line basis, and feed that
> into join, without the penalty of having to re-sort the entire file and
> without having to have the processed file stored in your file system all
> at once.  It also seems possible to write a post filter to get back to
> the style of the line in the original file.  Here, extensions such as bash's
>   join <(infilter file1) <(infilter file2) | outfilter
> make it easier to type (where the trick is to now write the correct sed
> scripts to serve as infilter and outfilter) than the alternative of
> having to use named fifos when limiting yourself to just POSIX semantics.

Hmm, isn't such number conversion filtering exactly what numfmt
was designed for?  But wait ...

  $ numfmt --field 1 --format='%020f' < f2
              99980081    1
             100002129   1
             100002136   2
             100002162   3

... it pads the field with blanks and doesn't support leading zeros,
unfortunately. ;-/
Wouldn't zero padding in --format be a nice enhancement?
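
In the meantime, here is a minimal sketch of the infilter/outfilter idea
with awk doing the zero padding.  The helper names, the sample rows in f1
and the 20-digit width are assumptions for illustration only; it also
assumes the join key is a non-negative integer in field 1, that fields are
blank-separated, and that bash is available for the <(...) process
substitution:

```shell
#!/bin/bash
# Sketch only: infilter/outfilter are hypothetical helpers, not part of coreutils.

# Left-pad the key (field 1) with zeros to 20 characters, as a string,
# so that numeric order and join's lexical order agree.
infilter() {
  awk '{ k = $1; while (length(k) < 20) k = "0" k; $1 = k; print }' "$1"
}

# Strip the padding again, keeping at least one digit of the key.
outfilter() {
  sed 's/^0*\([0-9]\)/\1/'
}

# Sample data (assumed): keys are in numeric order, not lexical order.
f1=$(mktemp); f2=$(mktemp)
printf '%s\n' '99980081 a' '100002129 b' '100002162 c' > "$f1"
printf '%s\n' '99980081 1' '100002129 1' '100002136 2' '100002162 3' > "$f2"

# Neither filtered file is ever stored on disk as a whole.
result=$(LC_ALL=C join <(infilter "$f1") <(infilter "$f2") | outfilter)
printf '%s\n' "$result"

rm -f "$f1" "$f2"
```

Assigning to $1 makes awk rebuild the line, so runs of blanks collapse to
a single space.  Padding the key as a string rather than via a numeric
printf avoids any rounding of very long numbers.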

Have a nice day,
Berny




