GNU Parallel
GNU Parallel is a shell tool that distributes jobs over all available CPU threads. It was written in Perl by Ole Tange.
Some impressions of where parallel can help:
First line in chapter 1 of the manual reads:
If you write shell scripts to do the same processing for different input, then GNU Parallel will make your life easier and make your scripts run faster.
From the man page:
If you write loops in shell, you will find GNU parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel.
As of Nov. 2022, this article is a work in progress. That's why it reads like a brainstorming session rather than a structured article.
Installation
$ sudo apt install parallel
...
The following NEW packages will be installed:
  parallel sysstat
...
Example intro chapter 1
The intro of chapter 1 of the manual contains this example:
seq 5 | parallel seq {} '>' example.{}
What it does:
- seq 5: produces the numbers 1 to 5
- seq {}: takes each of those numbers as the argument for a further sequence: a sequence with only the number 1, a sequence with the numbers 1 and 2, and so on up to a sequence with the numbers 1 to 5
- '>' example.{}: writes these five sequences to the files example.1 ... example.5
An even simpler example, although not very useful:
seq 5 | parallel echo {}
Here, the five echo commands are executed in parallel.
Just call a function x times
This seems like such an easy start, but no. Very instructive, definitely:
Baseline: Without parallel stuff
Execution time: 24s.
# parallel_test_function()
########################################
#
function parallel_test_function()
{
    printf "parallel_test_function - Start... "
    i=0
    for ((i; i<=1000000; i++))
    do
        i=$i+1
        i=$i-1
    done
    printf "Done. "
}

# Main
########################################
#
# Execution time (function: 1000000x. Here: 6x): 23, 24, 24, 24 ⇒ 24s
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
Just call a function with 'parallel'?
This doesn't work:
# Main
########################################
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    parallel parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
It results in this warning, because parallel is given no input here and therefore waits for arguments on standard input:
parallel: Warning: Input is read from the terminal. You either know what you
parallel: Warning: are doing (in which case: YOU ARE AWESOME!) or you forgot
parallel: Warning: ::: or :::: or to pipe data into parallel. If so
parallel: Warning: consider going through the tutorial: man parallel_tutorial
parallel: Warning: Press CTRL-D to exit.
Working or not?
This seems to work, but it isn't any faster:
# Call test function with parallel (v1)
########################################
#
# * Execution time (function: 1000000x. Here: 6x): 21, 23, 24 ⇒ 24s
# * With fewer inner loops and more outer loops, this is even slower than
#   without using parallel
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    sem parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
Finally!
Execution time: 9s - This works:
# Call test function with parallel (v2)
########################################
#
# Execution time (function: 1000000x. Here: 6x): 10, 9, 9 ⇒ 9s
#
export -f parallel_test_function
start=`date +%s`
seq 6 | parallel parallel_test_function
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
I first suspected that the loop kills parallelisation. The more likely cause is that sem runs only one job at a time unless -j says otherwise (by default it acts as a mutex), so the six calls in the previous trial still ran one after another. With parallel itself, only the command stated after the keyword parallel is parallelised, once for each input argument.
So the trick seems to be: when you have a loop and you want to parallelise it, get rid of the loop and let parallel do the iterating. A bit similar to changing a select query into an update query: always a bit of a puzzle, but doable.
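To make the loop-to-parallel conversion concrete, here is a minimal sketch with a trivial echo job. The values 1 2 3 and the variable names are only for illustration; -k is used so that the output keeps the input order.

```shell
# Sequential version: the loop hands out the values one by one
loop_out=$(for n in 1 2 3; do echo "item $n"; done)

# Parallel version: the loop is gone, {} takes the place of the loop variable
parallel_out=$(parallel -k echo "item {}" ::: 1 2 3)

echo "$parallel_out"
```

Both variables should end up with the same three lines; the difference is only in who does the iterating.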
How to get parallel stuff into parallel?
I find it difficult to understand how to feed work to parallel so that it actually runs in parallel. It resembles the difficulty I had with understanding SQL, which is a 4GL and does things implicitly. This section tries to give a bit of an overview.
One operator - Multiple arguments
Example of one operator with multiple arguments. In this case, the arguments are generated on the left of the command line, and piped into parallel. There is on
seq 10 | parallel echo {}
Is this the same as
seq 10 | parallel echo # Same as above?
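xargs implicitly appends the piped input to the end of the command. The same holds for parallel: when the command contains no replacement string, {} is appended to it, so the two commands above are equivalent.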
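A quick way to check this is to capture both variants and compare them. This is only a sketch: the variable names are made up, and -k is added so that the order of the output lines is predictable.

```shell
# With an explicit replacement string
with_braces=$(seq 3 | parallel -k echo {})

# Without one: parallel appends the argument to the command itself
without_braces=$(seq 3 | parallel -k echo)

[ "$with_braces" = "$without_braces" ] && echo "identical output"
```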
Multiple operators
Use a script file
parallel < my_script.sh
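When no command is given, parallel runs every input line as a command of its own. The sketch below uses a temporary file as a hypothetical stand-in for my_script.sh; the three echo lines are just placeholders.

```shell
# Hypothetical command file: one independent command per line
cmdfile=$(mktemp)
cat > "$cmdfile" <<'EOF'
echo first
echo second
echo third
EOF

# Each line is run as a separate job; -k prints the results in input order
script_out=$(parallel -k < "$cmdfile")
echo "$script_out"

rm -f "$cmdfile"
```

Because every line becomes a separate job, this suits files with independent one-line commands rather than scripts whose lines depend on each other.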
No difference between operators & arguments - Example Leo
[1]:
$ parallel "{1} {2}" ::: 'printf "%02d "' 'printf "%03d "' ::: 1 2
01 02 001 002
What it does:
- printf "%02d ": prints a number zero-padded to two digits, followed by a space
- printf "%03d ": the same, zero-padded to three digits
The arguments are combined into every combination:
* printf "%02d " 1
* printf "%02d " 2
* printf "%03d " 1
* printf "%03d " 2
and these are executed through parallel.
Small detail: I get output as above, but when I run the printf commands separately, I get more '0's. That is probably not caused by the single quotes: printf substitutes 0 for a missing numeric argument, so printf "%02d " without an argument prints "00 ".
Use a function
From Chapter 5 of the manual:
The command can be a script, a binary or a Bash function if the function is exported using export -f:
my_func() {
  echo in my_func $1
}
export -f my_func
parallel my_func ::: 1 2 3
Note the export -f: parallel runs each job in a new shell, so anything from the invoking shell that the job needs has to be made available there. See the section Subshells & variables for details.
Inline
It's possible to include multiple statements inline:
Multiple commands in pipeline
Apparently, use $(cmd1; cmd2; cmd3) to include multiple commands in a pipeline [2].
In combination with parallel:
seq 5 | parallel echo '$(j=$((2*{})); echo $(($j+100)))'
It looks a bit weird to have two echo commands, but it works.
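The outer echo and the $( ) wrapper are not strictly needed. A possible alternative, sketched here under the assumption that the whole command is passed as one single-quoted string, is to separate the statements with a semicolon:

```shell
# Both statements run inside the job's own shell, one after the other
inline_out=$(seq 5 | parallel -k 'j=$((2*{})); echo $((j+100))')
echo "$inline_out"
```

This should print 102, 104, 106, 108 and 110, one per line, just like the version with two echo commands.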
:::
With ::: you provide parallel arguments without using a pipe.
First example
# "seq" and "5" are regarded as parallel arguments for echo ;)
$ parallel echo ::: seq 5
seq
5
This returns the arguments seq and 5 rather than the output of seq 5, because everything after ::: is taken literally; nothing says that these arguments need to be executed. Put the command within $() to have the shell evaluate it before its output is passed to parallel.
Second example
This works! Note that output from parallel is usually on multiple lines:
$ i=$(seq 5)
$ echo $i
1 2 3 4 5
$ parallel echo ::: $i
1
2
3
4
5
An array isn't parallel - Unless it is?
I first had the impression that parallel doesn't treat array entries as separate arguments, but the whole array as just one argument. The cause is the pipe: parallel takes one argument per input line, and echo ${j[@]} prints all entries on a single line:
j=(1 2 3 4 5)
echo ${j[@]}
seq 5 | parallel echo {}
echo ${j[@]} | parallel echo {}
seq 5 | parallel 'echo $(({}+{}))'
echo ${j[@]} | parallel 'echo $(({}+{}))'   # Error: Invalid arithmetic operator
But this works:
j=(1 2 3 4 5)
parallel echo ::: ${j[@]}
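An array can also go through a pipe, as long as every entry ends up on its own line. A minimal sketch, using printf instead of echo (the variable array_out is only there to capture the result):

```shell
j=(1 2 3 4 5)

# printf repeats its format for every array entry, giving one entry per line
array_out=$(printf '%s\n' "${j[@]}" | parallel -k 'echo $(({}+{}))')
echo "$array_out"
```

With one entry per line, parallel sees five separate arguments and the arithmetic works for each of them.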
Operate on entries before parallel?
Can you first operate on an entry before it is processed by parallel?
Example using ::::
# First the argument is expanded and only then the operator applied
# (hence to the last item only)
#
$ i=$(seq 5)
$ parallel echo ::: $i+1
1
2
3
4
5+1
This seems more intuitive using a pipe, as individual arguments are available through {}:
$ seq 5 | parallel 'echo $(({}+{}))'
2
4
6
8
10
Multiple operations on entries before parallel?
Concerning WP-CLI, it would be really cool if multiple commands can be run parallel, that each do something with the output.
Example:
- Have a sequence 1...5
- Do this in parallel for each of the numbers:
  - Multiply an entry by 2
  - Add 1 to the result.
Order matters here: if 'adding 1' were done in parallel to 'multiply by 2', the results might become unpredictable.
Let's try:
seq 5 | parallel 'echo ((2*{}))' ...
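One way to keep the two steps in order is to put them in an exported function, so that each job runs them one after the other while the five jobs themselves run in parallel. This is a sketch only; the function name double_plus_one is made up for this example.

```shell
# Hypothetical helper: both steps happen sequentially within one job
double_plus_one() {
    local doubled=$(( 2 * $1 ))   # step 1: multiply the entry by 2
    echo $(( doubled + 1 ))       # step 2: add 1 to the result
}
export -f double_plus_one

# parallel appends each number as the first argument of the function
steps_out=$(seq 5 | parallel -k double_plus_one)
echo "$steps_out"
```

The same pattern should carry over to longer chains of steps, as long as each job only depends on its own input.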
Reuse argument
A case that I sometimes encounter using WP-CLI:
- Have a sequence 1...5
- Do this in parallel for each of the numbers:
  - Multiply entry by 2
  - Multiply entry by 3
  - Add the outcome of these two multiplications
Let's start with the last three lines and first make sure I get that part right :)
i=5
echo $((2*$i + 3*$i))
Now together:
seq 5 | parallel 'echo $((2*{} + 3*{}))'
And why stop here?
seq 5 | parallel 'echo "{} - $((2*{}+3*{}))"'
Evaluate parallel argument first
This won't work:
$ seq 20 | parallel echo $(({}+{}))
bash: {}+{}: syntax error: operand expected (error token is "{}+{}")
The reason: the invoking shell evaluates the part between $(( )) first, while {} is still a literal string, and only then would the result be passed as an argument to parallel.
To change that, put the parallel argument between single quotes:
seq 20 | parallel 'echo $(({}+{}))'
BTW, this doesn't work:
$ seq 20 | parallel echo $(('{}'+'{}'))
bash: '{}'+'{}': syntax error: operand expected (error token is "'{}'+'{}'")
Reusing an argument multiple times
You can use the parallel argument multiple times: just use {} multiple times, and keep the command in single quotes so that the invoking shell leaves it alone:
seq 20 | parallel 'echo $(({}+{}))'
sem
sem stands for semaphore, a token that is passed around to coordinate jobs running in parallel. I bumped into this, but I am not sure it works for me like this within a loop. The GNU Parallel manual doesn't seem to be exhaustive on this topic. I found https://www.gnu.org/software/parallel/sem.html a much better source.
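As far as I can tell from the sem documentation, sem inside a loop does run jobs concurrently once -j allows more than one job (its default is 1), and sem --wait then blocks until all of them are finished. A minimal sketch; the semaphore name sem_demo, the temporary directory and the sleep job are assumptions for illustration only:

```shell
outdir=$(mktemp -d)

for n in 1 2 3 4 5 6; do
    # sem starts the job in the background as soon as one of the 3 slots is free
    sem -j 3 --id sem_demo "sleep 1; touch '$outdir/job.$n'"
done

# Block until every job started under this semaphore name has finished
sem --wait --id sem_demo

finished=$(ls "$outdir" | wc -l)
echo "Finished jobs: $finished"
rm -rf "$outdir"
```

Giving the semaphore an explicit --id keeps the jobs of this script apart from other sem calls that happen to run at the same time.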
How many concurrent threads?
Questions
- Is it about threads, processors, cores, sockets or what?
- Do I need to optimize myself for the number of threads? Or just leave this up to GNU Parallel?
- What are the effects for sub-optimized cases?
Answers
- Processors (processing units) are to a computer what cylinders are to a car. See Processors, cores & threads on this computer (Bash) for details.
- GNU Parallel clearly knows what the optimal number of threads is. See the test code below for the case with sem -j +0: here the number of threads is the same as the number of processors, and the timings confirm this.
- When optimizing manually, rather choose a number of threads that is a bit too high than one that is too low. However, this very much depends on the use case, e.g. whether CPU power or I/O is the bottleneck. I'm quite sure that for me, it's usually CPU power, though.
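To see which numbers are involved on a given machine, something like the following can be used. It is a sketch that assumes a Linux system with coreutils (for nproc) and a GNU Parallel version that supports --number-of-cores; the job names are placeholders.

```shell
# Number of processing units available to this process (coreutils)
threads=$(nproc)
echo "Available processing units: $threads"

# Number of physical cores as detected by GNU Parallel
parallel --number-of-cores

# -j +0 adds 0 to the number of CPU threads; -j 200% would instead
# start two jobs per CPU thread
seq 4 | parallel -k -j +0 echo "job {}"
```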
Test scripts
################################################################################
# Thread optimisation
################################################################################
#
# My laptop can do 8 threads. Let's see what happens to performance when I
# force more or fewer threads:
#
#
# parallel_test_function()
########################################
#
function parallel_test_function()
{
    printf "PTF - Start... "
    i=0
    for ((i; i<=100000; i++))
    do
        i=$i+1
        i=$i-1
    done
    printf "Done. "
}
export -f parallel_test_function

# # Test - 8 threads
# ########################################
# #
# # * Execution time (s): 5, 5, 5, 5 ⇒ 5s
# #
# start=`date +%s`
# #
# # 32 calls to sem, at most 8 jobs at a time
# for n in $(seq 32); do sem -j 8 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# Test - 16 threads
########################################
#
# * Execution time (s): 5, 5, 5, 5, 5 ⇒ 5s
#
# start=`date +%s`
# #
# # 32 calls to sem, at most 16 jobs at a time
# for n in $(seq 32); do sem -j 16 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# # Test - 32 threads
# ########################################
# #
# # * Execution time (s): 6, 6, 5, 5, 6 ⇒ 5.6s
# #
# start=`date +%s`
# #
# # 32 calls to sem, at most 32 jobs at a time
# for n in $(seq 32); do sem -j 32 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# # Test - 4 threads
# ########################################
# #
# # * Execution time (s): 5, 6, 5, 6, 6, 5 ⇒ 5.5s
# #
# start=`date +%s`
# #
# # 32 calls to sem, at most 4 jobs at a time
# for n in $(seq 32); do sem -j 4 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# # Test - 2 threads
# ########################################
# #
# # * Execution time (s): 7, 6, 6, 7, 7, 7 ⇒ 6.7s
# #
# start=`date +%s`
# #
# # 32 calls to sem, at most 2 jobs at a time
# for n in $(seq 32); do sem -j 2 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# # Test - 1 thread
# ########################################
# #
# # * Execution time (s): 12, 12, 12, 12 ⇒ 12s
# #
# start=`date +%s`
# #
# # 32 calls to sem, one job at a time
# for n in $(seq 32); do sem -j 1 parallel_test_function; done
# #
# sem --wait
# end=`date +%s`
# echo ""; echo Execution time was `expr $end - $start` seconds.
# Test - Auto-optimized
########################################
#
# * Execution time (s): 5, 5, 5, 5, 5 ⇒ 5s
#
start=`date +%s`
#
# 32 calls to sem, as many jobs at a time as there are CPU threads
for n in $(seq 32); do sem -j +0 parallel_test_function; done
#
sem --wait
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
Sources
- https://unix.stackexchange.com/questions/114672/gnu-parallel-more-than-one-per-cpu
- https://www.gnu.org/software/parallel/sem.html
Subshells & variables
I have the impression that GNU Parallel creates a subshell (a new shell process) for each job and that precautions have to be taken to ensure that functions and variables are available in that subshell.
Note that this subshell stuff is not the same as scope within a single shell.
Without exported function or variable
################################################################################
# Subshell & vars? - Without exporting function or var
################################################################################
#
# Function
########################################
#
subfunction()
{
    echo "Subfunction - Var j: $j"
}

# Main
########################################
#
j=12
echo "Main - var j: $j"
#
parallel subfunction ::: $(seq 5)
Output:
Main - var j: 12
/bin/bash: subfunction: command not found
/bin/bash: subfunction: command not found
/bin/bash: subfunction: command not found
/bin/bash: subfunction: command not found
/bin/bash: subfunction: command not found
With exported function
Now the function is exported using export -f subfunction and GNU Parallel can find it. However, the variable j is not available within this function.
################################################################################
# Subshell & vars? - With exporting function
################################################################################
#
# Function
########################################
#
subfunction()
{
    echo "Subfunction - Var j: $j"
}

# Main
########################################
#
j=12
echo "Main - var j: $j"
export -f subfunction
#
parallel subfunction ::: $(seq 5)
Output:
Main - var j: 12
Subfunction - Var j:
Subfunction - Var j:
Subfunction - Var j:
Subfunction - Var j:
Subfunction - Var j:
With exported function and exported variable
Hooray! Sometimes, things are easy:
################################################################################
# Subshell & vars? - With exporting function and var
################################################################################
#
# Function
########################################
#
subfunction()
{
    echo "Subfunction - Var j: $j"
}

# Main
########################################
#
j=12
echo "Main - var j: $j"
export -f subfunction
export j
#
parallel subfunction ::: $(seq 5)
Output:
Main - var j: 12
Subfunction - Var j: 12
Subfunction - Var j: 12
Subfunction - Var j: 12
Subfunction - Var j: 12
Subfunction - Var j: 12
But not for arrays
It seems that regular arrays and associative arrays cannot be exported to subshells:
################################################################################
# Subshell, var & arrays
################################################################################
#
# Function
########################################
#
subfunction()
{
    echo "function - Var i: $i"
    echo "function - Associative array j: ${j[@]}"
    echo "function - Regular array k: ${k[@]}"
}

# Main
########################################
#
i=12
declare -gA j
j[foo,1]="Foo-1"
j[bar,2]="Bar-2"
k[1]="K1"
k[2]="K2"
echo "Main - var j: $i"
export -f subfunction
export i
export j        # Doesn't work
export j[@]     # Doesn't work
export k        # Doesn't work
export k[@]     # Doesn't work
export {k[@]}   # Doesn't work
#
parallel subfunction ::: $(seq 5)
Output:
Main - var j: 12
./parallel.sh: line 181: export: `j[@]': not a valid identifier
./parallel.sh: line 183: export: `k[@]': not a valid identifier
./parallel.sh: line 184: export: `{k[@]}': not a valid identifier
function - Var i: 12
function - Associative array j:
function - Regular array k:
function - Var i: 12
function - Associative array j:
function - Regular array k:
function - Var i: 12
function - Associative array j:
function - Regular array k:
function - Var i: 12
function - Associative array j:
function - Regular array k:
function - Var i: 12
function - Associative array j:
function - Regular array k:
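A possible workaround, sketched here and not taken from the GNU Parallel manual, is to serialise each array into an ordinary string with declare -p, export that string, and rebuild the array inside the function. The names j_def and k_def are made up for this example, and a fresh bash process stands in for the shell that parallel would start.

```shell
declare -A j=([foo,1]="Foo-1" [bar,2]="Bar-2")
k=(K1 K2)

# Serialise both arrays into plain string variables, which can be exported
export j_def="$(declare -p j)"
export k_def="$(declare -p k)"

subfunction() {
    # Rebuild the arrays (as local variables) from the exported strings
    eval "$j_def"
    eval "$k_def"
    echo "function - Regular array k: ${k[*]}"
    echo "function - Associative array j[foo,1]: ${j[foo,1]}"
}
export -f subfunction

# A new bash process sees neither j nor k, only the exported strings
array_out=$(bash -c 'subfunction')
echo "$array_out"
```

With GNU Parallel, the last step would become parallel subfunction ::: $(seq 5), with everything else unchanged.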
Case: collect term_ids through wp-cli
Seems like a good case for replacing a loop with parallel.
Original code:
# Collect all term_ids through a loop
#######################################
#
# * There are 1,433 terms to collect
# * Max. 100 items are returned at once
# * Hence this loop needs 15 iterations
#
i=1
echo "Loop - Collect all term_ids"
#
for ((i; i<=$number_of_iterations; i++))
do
    #
    echo "    Iteration $i/$number_of_iterations"
    #
    # Store batch of term ids in tmp array j
    ########################################
    #
    mapfile -t j < <( wp wc product_attribute_term list \
        $taxonomy_id \
        --user=4 \
        --field=id \
        --offset=$((($i-1)*100)) | grep . )
    #
    # echo "    j: ${j[@]}"
    #
    # Append to array term_id
    ########################################
    #
    term_id=(${term_id[@]} ${j[@]})
    echo "    Length term_id: ${#term_id[@]}"
    #
done
New code:
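One possible shape for the parallel version, as an untested sketch: the loop counter is replaced by a list of offsets that is piped into parallel. The function fetch_batch is a hypothetical stand-in that only generates numbers, so the block runs without WP-CLI; whether the site copes with several concurrent wp calls is an open question.

```shell
# Hypothetical stand-in for one wp call; prints the ids of one batch.
# In the real script its body would be something like:
#   wp wc product_attribute_term list $taxonomy_id --user=4 --field=id --offset=$1 | grep .
fetch_batch() {
    local offset=$1
    seq $(( offset + 1 )) $(( offset + 100 ))
}
export -f fetch_batch

number_of_iterations=15

# One job per batch: the offsets 0, 100, ..., 1400 replace the loop counter.
# -k keeps the batches in offset order.
mapfile -t term_id < <(
    seq 0 100 $(( (number_of_iterations - 1) * 100 )) | parallel -k fetch_batch
)
echo "Length term_id: ${#term_id[@]}"
```

With the stand-in, every batch is full, so the array ends up with 15 x 100 entries; with the real wp call, taxonomy_id would also have to be exported for the function to see it.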
Sources
- https://en.wikipedia.org/wiki/GNU_parallel
- https://www.gnu.org/software/parallel/
- https://zenodo.org/record/1146014/files/GNU_Parallel_2018.pdf?download=1
- https://bash-prompt.net/guides/parallell-bash/
- https://medium.com/linuxstories/bash-parallel-command-execution-d4bd7c7cc1d6
- https://adamtheautomator.com/how-to-speed-up-bash-scripts-with-multithreading-and-gnu-parallel/
- https://www.baeldung.com/linux/processing-commands-in-parallel
- https://www.msi.umn.edu/support/faq/how-can-i-use-gnu-parallel-run-lot-commands-parallel
- https://stackoverflow.com/questions/61483185/gnu-parallel-multiple-commands