Just call a function multiple times (GNU Parallel)
This was one of my first trials with GNU Parallel and I thought it was quite simple: execute the same function 6 times.
Seemed like an easy start, but it took quite some time to get it working.
- The 'CPU consuming' part is a loop with an integer addition and a subtraction. I didn't want to use something with sleep, as that might not actually take up CPU resources.
- Execution time is mentioned for the various implementations. I think this was on my laptop, but that's beside the point. The essence is being able to compare the results.
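The point about sleep can be checked with bash's time keyword. The following is a minimal sketch, not part of the original test script: busy_loop is a hypothetical helper and the loop count is arbitrary. It prints wall-clock time next to user CPU time for both cases; sleep should show about one second of elapsed time with almost no CPU time, while the loop should show CPU time close to its elapsed time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: a short arithmetic loop that keeps one core busy.
busy_loop() {
    local i
    for ((i = 0; i < 300000; i++)); do :; done
}

# %R is elapsed wall-clock time, %U is CPU time spent in user mode.
TIMEFORMAT='real=%R user=%U'

# The time keyword reports on stderr, so redirect it to capture the report.
sleep_report=$( { time sleep 1; } 2>&1 )
busy_report=$( { time busy_loop; } 2>&1 )

echo "sleep 1:   $sleep_report"
echo "busy_loop: $busy_report"
```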
Baseline: Without parallel stuff
Execution time: 24s.
# parallel_test_function()
########################################
#
function parallel_test_function()
{
    printf "parallel_test_function - Start... "
    i=0
    for ((i; i<=1000000; i++))
    do
        i=$i+1
        i=$i-1
    done
    printf "Done. "
}

# Main
########################################
#
# Execution time (function: 1000000x. Here: 6x): 23, 24, 24, 24 ⇒ 24s
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
Just call a function with 'parallel'?
This doesn't work:
# Main
########################################
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    parallel parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
It will result in this warning, after which parallel sits waiting for input from the terminal:
parallel: Warning: Input is read from the terminal. You either know what you
parallel: Warning: are doing (in which case: YOU ARE AWESOME!) or you forgot
parallel: Warning: ::: or :::: or to pipe data into parallel. If so
parallel: Warning: consider going through the tutorial: man parallel_tutorial
parallel: Warning: Press CTRL-D to exit.
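The warning appears because parallel was started without any input to turn into jobs. A minimal sketch of one way to supply that input, using the ::: syntax the warning mentions. The shortened loop and the -N0 option are choices made for this illustration, not part of the original script:

```shell
#!/usr/bin/env bash
# Shortened stand-in for the CPU-bound function (assumption: the real one
# loops 1000000 times; a smaller count keeps this example quick).
parallel_test_function() {
    local i
    for ((i = 0; i < 100000; i++)); do :; done
    echo "Done."
}
export -f parallel_test_function

# Each value after ::: produces one job. -N0 consumes one value per job
# but passes no argument to the command, so the function is called bare.
output=$(parallel -N0 parallel_test_function ::: 1 2 3 4 5 6)
echo "$output"
```

The values 1 to 6 only determine how many jobs are started; any six values would do.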
Working or not?
This seems to work, but it isn't any faster:
# Call test function with parallel (v1)
########################################
#
# * Execution time (function: 1000000x. Here: 6x): 21, 23, 24 ⇒ 24s
# * With fewer inner loops and more outer loops, this is even slower than
#   without using parallel
#
export -f parallel_test_function
start=`date +%s`
j=0
for ((j; j<=5; j++))
do
    sem parallel_test_function
done
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
Finally!
Execution time: 9s - This works:
# Call test function with parallel (v2)
########################################
#
# Execution time (function: 1000000x. Here: 6x): 10, 9, 9 ⇒ 9s
#
export -f parallel_test_function
start=`date +%s`
seq 6 | parallel parallel_test_function
end=`date +%s`
echo ""; echo Execution time was `expr $end - $start` seconds.
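I suspect that calling parallel inside the loop could never have worked, because the loop kills parallelisation: each iteration starts its own parallel, which has to finish before the next iteration begins, so only what is stated after the keyword parallel is actually parallelised. That reasoning does not carry over to sem, though. sem does start its command in the background, but by default it allows only one job at a time (-j 1, so it acts as a mutex), which would explain why the sem version was no faster than the baseline. With sem -j N inside the loop and sem --wait after it, the loop version should parallelise as well.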
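A minimal sketch of the sem variant with more than one job slot, assuming the one-job default is what held the earlier attempt back. The semaphore name parallel_demo, the shortened loop, and the marker files are all made up for this illustration:

```shell
#!/usr/bin/env bash
# Shortened stand-in for the CPU-bound function. It drops a marker file so
# that completion can be counted afterwards (an assumption for this sketch).
parallel_test_function() {
    local i
    for ((i = 0; i < 100000; i++)); do :; done
    touch "$1/job.$2"
}
export -f parallel_test_function

workdir=$(mktemp -d)

for ((j = 1; j <= 6; j++)); do
    # --id names the semaphore; -j 6 lets up to six jobs run at once.
    sem --id parallel_demo -j 6 parallel_test_function "$workdir" "$j"
done

# Block until every job started under this semaphore id has finished.
sem --id parallel_demo --wait

finished=$(( $(ls "$workdir" | wc -l) ))
echo "Finished jobs: $finished"
rm -rf "$workdir"
```

Without the final sem --wait, the script would carry on while jobs are still running, so a timing taken right after the loop would be too optimistic.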
So the trick seems to be: when you have a loop and you want to parallelise it, make sure that you get rid of the loop and hand its iterations to parallel as input instead. A bit similar to changing a select query to an update query: always a bit of puzzling, but doable.
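As a sketch of that rewrite for a loop whose body uses the loop variable: square is a hypothetical function, {} is where parallel puts each input value, and -k keeps the output in input order so both versions can be compared.

```shell
#!/usr/bin/env bash
# Hypothetical function standing in for the loop body.
square() {
    echo "$1 squared is $(( $1 * $1 ))"
}
export -f square

# Loop version: one call after the other.
sequential=$(for n in 1 2 3 4 5 6; do square "$n"; done)

# Parallel version: the loop is gone, the values arrive on stdin.
parallel_out=$(seq 6 | parallel -k square {})

echo "$parallel_out"
```

Without -k the lines are printed in the order in which the jobs finish, which may differ from the input order.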