The difference between "parallel cat >> file" and "parallel “cat >> file”"
From: Nan Xiao
Subject: The difference between "parallel cat >> file" and "parallel “cat >> file”"
Date: Fri, 11 Mar 2016 09:48:51 +0800
Hi all,
I want to make sure that my understanding of the difference between
ls | parallel -m -j $f "cat {} >> ../transactions_cat/transactions.csv"
and
ls | parallel -m -j $f cat {} >> ../transactions_cat/transactions.csv
is right:
(1) ls | parallel -m -j $f "cat {} >> ../transactions_cat/transactions.csv"
In this case, the jobs should be:
job 1: cat file1 >> ../transactions_cat/transactions.csv
job 2: cat file2 >> ../transactions_cat/transactions.csv
job 3: cat file3 >> ../transactions_cat/transactions.csv
......
Since the redirection to "../transactions_cat/transactions.csv" is part of each job, it is outside GNU Parallel's control. So there is
a contention issue: multiple processes write to the same file concurrently, and a lock may be needed.
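The lock mentioned above could look roughly like the following. This is only a sketch under assumptions: the helper name append_locked and the lock directory name (DEST.lock) are made up for illustration, and the locking relies on mkdir being atomic rather than on anything GNU Parallel provides. Background shell jobs stand in for the parallel jobs.

```shell
#!/bin/sh
# append_locked SRC DEST (hypothetical helper): append SRC to DEST while
# holding a lock, so concurrent writers cannot interleave their output.
append_locked() {
    lock="$2.lock"                 # assumed lock name: DEST.lock
    # mkdir either creates the directory or fails, atomically, so only
    # one process at a time gets past this loop.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1                    # lock is held by another writer; retry
    done
    cat "$1" >> "$2"
    rmdir "$lock"                  # release the lock
}

# Demo: three background writers stand in for three parallel jobs.
dir=$(mktemp -d)
for n in 1 2 3; do
    printf 'row from file%s\n' "$n" > "$dir/file$n"
done
for n in 1 2 3; do
    append_locked "$dir/file$n" "$dir/transactions.csv" &
done
wait
cat "$dir/transactions.csv"        # three rows, in whatever order the writers ran
rm -rf "$dir"
```

A lock directory left behind by a killed job would block later writers, which is one reason a dedicated tool such as flock(1) may be preferable where it is available.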
(2) ls | parallel -m -j $f cat {} >> ../transactions_cat/transactions.csv
In this case, the jobs should be:
job 1: cat file1
job 2: cat file2
job 3: cat file3
......
Since the redirection to "../transactions_cat/transactions.csv" is applied to parallel itself, the output is under GNU Parallel's control. GNU Parallel
can buffer the output of every job and write each job's output to "../transactions_cat/transactions.csv" one at a time, which makes sure the output
of different jobs does not get mixed up.
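To illustrate who performs the redirection in the two cases, here is a small sketch that does not use GNU Parallel at all. The function run_job is a hypothetical stand-in for a job runner: it reports the command line it received on stderr and then runs it. The file names are made up for the demo.

```shell
#!/bin/sh
# run_job (hypothetical stand-in for a job runner such as parallel):
# report the command line it was given, then run it.
run_job() {
    echo "runner received: $*" >&2
    sh -c "$*"
}

dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'hello\n' > file1

# Quoted: the redirection is part of the string handed to the runner,
# so the job itself opens quoted.csv.
run_job "cat file1 >> quoted.csv"
# stderr: runner received: cat file1 >> quoted.csv

# Unquoted: the calling shell opens unquoted.csv before run_job starts;
# run_job only sees the words "cat" and "file1", and its stdout is the file.
run_job cat file1 >> unquoted.csv
# stderr: runner received: cat file1

cd / && rm -rf "$dir"
```

Both files end up with the same content here because there is only one job. The difference matters once several jobs run at the same time: in the quoted form each job opens and appends to the file on its own, while in the unquoted form everything passes through the runner's single stdout.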
Is my understanding right? If not, could someone please correct me?
Thanks in advance!