Hi,
first of all - thanks for parallel, it's a great tool!
I have a use case where a large workload is generated dynamically, and piping that input to parallel via stdin takes a long time (a couple of hours). Because it takes so long, I would really like to use --bar to see the progress and ETA, but that fails with:
> parallel: Warning: Reading NNNN arguments took longer than 10 seconds
> parallel: Warning: Consider removing --bar
This happens because, with --bar or --eta, parallel first reads all input lines to learn the total number of jobs, which it needs to calculate the percentage and ETA.
In my case I do know the total upfront though, so I could simply pass that number into parallel. I did a quick patch and it works nicely:
- added a "--total N" option that takes an integer
- if it is set, sub total_jobs() uses that value in preference to counting the input (or any of the other cases it handles)
- the rest follows automatically
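To illustrate the idea independently of parallel's internals, here is a minimal shell sketch. The function name progress_with_total is made up for this mail and is not part of parallel; it only shows that, once the total is passed in, the percentage can be printed for each line as it arrives instead of after the whole input has been read:

```shell
# Hypothetical helper (not part of GNU parallel): prints progress for each
# input line as it arrives, using a total that the caller already knows.
progress_with_total() {
    total=$1
    if [ "$total" -le 0 ]; then
        echo "total must be a positive integer" >&2
        return 1
    fi
    count=0
    while IFS= read -r line; do
        count=$((count + 1))
        # The percentage needs only the running count and the given total,
        # so nothing has to wait for the end of the input.
        printf '%d/%d (%d%%) %s\n' "$count" "$total" "$((count * 100 / total))" "$line"
    done
}

# Example: four jobs announced upfront, lines streamed in one by one.
printf 'a\nb\nc\nd\n' | progress_with_total 4
```

With a slow producer in place of printf, the first progress line would show up as soon as the first input line does, which is the behaviour I am after with --bar.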
Below is a quick patch against version "GNU parallel 20220822":
1706c1706,1707
< ("debug|D=s" => \$opt::D,
---
> ("total=i" => \$opt::total,
> "debug|D=s" => \$opt::D,
8818c8819,8821
< if($opt::sqlworker) {
---
> if($opt::total) {
> $self->{'total_jobs'} = $opt::total;
> } elsif($opt::sqlworker) {
Not sure what the right contribution process is, so I thought I would start with a mail on this list.
Cheers,
Alexander Klimetschek