I am slowly working through the tutorials and looking at the --retries and --filter-hosts options.
I would like to engineer a recovery-oriented robust parallel job that retries tasks on "the next" (usually different) host in a round-robin when a task times out or fails.
I have 10 hosts that mount the same NFS file system, and I want to dispatch manifests of tens of thousands of files to those 10 hosts round-robin style. If a manifest job fails or times out, I want to retry it on a different host and continue until all the tasks complete.
Can GNU parallel be used in such a way that it retries jobs on different hosts?
find -type f -name "*manifest*" | parallel . . .
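
To make the retry policy concrete, here is a rough sketch of the behavior I am after, written in plain bash rather than with GNU parallel. It is only an illustration: the host names and the run_on_host function are hypothetical placeholders, the failure on host01 is simulated so the sketch runs locally, and it processes one manifest at a time instead of in parallel.

```shell
#!/usr/bin/env bash
# Sketch of the desired policy only; this is not how GNU parallel works internally.

HOSTS=(host01 host02 host03)   # hypothetical host names (I have 10 in reality)
MAX_TRIES=${#HOSTS[@]}         # assumption: try each host at most once per manifest

# Hypothetical stand-in for the real remote command, which might look like
#   timeout 600 ssh "$host" process_manifest "$manifest"
# Here it just simulates a failure on host01.
run_on_host() {
  local host=$1 manifest=$2
  [ "$host" != "host01" ]
}

# Try a manifest on its starting host, then on each following host in
# round-robin order. Prints the host that succeeded; returns 1 if all failed.
dispatch() {
  local manifest=$1 start=$2 try host
  for ((try = 0; try < MAX_TRIES; try++)); do
    host=${HOSTS[$(( (start + try) % ${#HOSTS[@]} ))]}
    if run_on_host "$host" "$manifest"; then
      echo "$host"
      return 0
    fi
  done
  return 1
}
```

With this, dispatch some.manifest 0 would start on host01, fail there, and succeed on host02. What I would like to know is whether parallel can apply the same "move to the next host on failure or timeout" rule by itself.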