bug-parallel

Re: Cannot specify the number of threads for parsort


From: Mario Roy
Subject: Re: Cannot specify the number of threads for parsort
Date: Thu, 9 Feb 2023 04:12:51 -0600

Aloha again,

Some context: I gave GNU Parallel a try at PerlMonks. See the thread below.

Re: Rosetta Code: Long List is Long - GNU Parallel
https://perlmonks.org/?node_id=11150254

Blessings and grace,
  - Mario


On Thu, Feb 9, 2023 at 1:24 AM Mario Roy <marioeroy@gmail.com> wrote:
Aloha,

$ parallel --version
GNU parallel 20230122

This is a wish-list item: allow the number of threads to be specified via an environment variable that works consistently for both parallel and parsort. Basically, I want to set the number of threads for parsort regardless of whether the input comes from files given as command-line arguments or from STDIN.

In the meantime, I created a wrapper script and placed it in a directory (/usr/local/bin) that comes earlier in PATH than the one where parallel resides (/usr/bin).

#!/usr/bin/env bash
# Wrapper script for parallel.

# Whoa!!! GNU Parallel assumes you want to consume all CPU cores.
# Unfortunately, one cannot specify the number of threads for parsort.

CMD="/usr/bin/parallel"

# No override requested: run the real parallel unchanged.
if [[ -z "${PARALLEL_NUM_THREADS:-}" ]]; then
  exec "$CMD" "$@"

# Answer the thread-count query with the override.
elif [[ "$#" -eq 1 && "$1" == "--number-of-threads" ]]; then
  echo "$PARALLEL_NUM_THREADS"; exit 0

# A leading "-j N" is dropped and replaced by the override.
elif [[ "$1" == "-j" ]]; then
  shift; shift; exec "$CMD" -j "$PARALLEL_NUM_THREADS" "$@"

else
  exec "$CMD" -j "$PARALLEL_NUM_THREADS" "$@"

fi



Use case:

export PARALLEL_NUM_THREADS=6

LC_ALL=C parsort -k1 big{1,2,3}.txt | tally-count | LC_ALL=C parsort -k2nr >out.txt

cat big{1,2,3}.txt | LC_ALL=C parsort -k1 | tally-count | LC_ALL=C parsort -k2nr >out.txt

The big files contain two-column key-value pairs (key name and count) delimited by a tab. The output of the first parsort contains duplicate key names.
The tally-count command sums the count fields of adjacent lines that share the same key name, so its output contains unique key names.
The second parsort then orders the result by sum descending, then key name ascending.
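
For readers without the tally-count command, the summing step described above can be approximated with awk. This is only a minimal sketch, not the actual tally-count from the PerlMonks thread: the function name tally_count is hypothetical, and it assumes tab-delimited input that is already sorted by key so that duplicate keys are adjacent.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for tally-count (NOT the original program).
# Reads tab-delimited "key<TAB>count" lines on stdin, already sorted by key,
# and prints one "key<TAB>sum" line per run of adjacent identical keys.
tally_count() {
  awk -F'\t' -v OFS='\t' '
    # Key changed: emit the finished run and reset the sum.
    NR > 1 && $1 != key { print key, sum; sum = 0 }
    # Accumulate the count for the current key.
    { key = $1; sum += $2 }
    # Emit the last run, unless the input was empty.
    END { if (NR > 0) print key, sum }
  '
}
```

With this function defined in the current shell, it could take the place of tally-count in the pipelines above, for example: cat big{1,2,3}.txt | LC_ALL=C parsort -k1 | tally_count | LC_ALL=C parsort -k2nr >out.txt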


Blessings and grace,
  - Mario


