parallel

Help with input line replacement string -


From: Craig Carl
Subject: Help with input line replacement string -
Date: Sun, 1 Apr 2012 12:34:41 -0700

All -
I need a little help munging a string I need to pass through parallel.
I'm using parallel to distribute some S3 download tasks using
s3cmd. s3cmd takes a couple of arguments -

s3cmd get <object to get> <path to put object>

<object to get> is easy; I'm having a hard time with <path to put object>.

I get a list of objects to pipe to parallel like this -

#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{ print $4}'

s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/1gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/2gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/3gram/data
s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/4gram/data

I pipe that to parallel like this -

#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} <path to put object>

Using the above example, I need <path to put object> to be -

./ngrams/books/20090715/chi-sim-all/1gram/data
./ngrams/books/20090715/chi-sim-all/2gram/data
./ngrams/books/20090715/chi-sim-all/3gram/data
./ngrams/books/20090715/chi-sim-all/4gram/data

An easy bit of bash will build the string,
${<object to get>/s3\:\/\/datasets.elasticmapreduce/.}
but I can't figure out how to get that working with parallel. I've tried -

#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} ${{}/s3\:\/\/datasets.elasticmapreduce/.}
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} "${"{}"/s3\:\/\/datasets.elasticmapreduce/.}"
#s3cmd ls --recursive s3://datasets.elasticmapreduce/ngrams/books/ |
awk '{print $4}' | parallel -j0 --sshloginfile hosts /usr/bin/s3cmd
--no-progress get {} '${'{}'/s3\:\/\/datasets.elasticmapreduce/.}'

Plus a couple of others. I get a "bad substitution" error no matter
what I try. I'm wondering if there is a way I could build the path as
part of the 's3cmd ls' command and then use {n} to get the path, but
I'm open to any suggestions.
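
For reference, the bash substitution described above does work when it is
applied to a named variable, so the "bad substitution" error most likely
comes from {} sitting where bash expects a variable name. The second half
sketches the "build the path up front" idea with awk. This is only a
minimal sketch, assuming bash; to_local_path and pair_columns are
hypothetical helper names, not part of s3cmd or parallel.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an S3 object URL to a relative local path.
to_local_path() {
  local obj=$1
  # ${var/pattern/replacement} needs a variable NAME before the first slash;
  # a literal string such as {} in that position is a "bad substitution".
  printf '%s\n' "${obj/s3:\/\/datasets.elasticmapreduce/.}"
}

to_local_path s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/1gram/data

# Hypothetical helper: read object URLs on stdin and print
# "object<TAB>local path", so both values arrive as ready-made columns.
pair_columns() {
  awk '{ dest = $0
         sub(/^s3:\/\/datasets\.elasticmapreduce/, ".", dest)
         print $0 "\t" dest }'
}

printf '%s\n' \
  s3://datasets.elasticmapreduce/ngrams/books/20090715/chi-sim-all/2gram/data |
  pair_columns
```

With two tab-separated columns, something along the lines of
parallel --colsep '\t' s3cmd --no-progress get {1} {2}
might then pick up the object and the path as {1} and {2}. That last step
is untested here and would need checking against the installed version of
parallel.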

Thanks,

Craig


