Parallelisation (Bash)

From De Vliegende Brigade

Accessing WordPress through WP-CLI is terribly slow. Fortunately, you can parallelize things in Bash:

Parallelize a single command using '&'

An example:

Without parallelisation

This command takes about 5 s:

wp wc shop_order list --user=4 --field=id | xargs -n1 wp wc shop_order delete --user=4 --force=1

The same command on a VPS with 4 instead of 2 CPUs takes just as long (even slightly longer). That was no surprise: this PHP command cannot really be parallelized, because it is one serial affair. Using top and ps I could see that separate processes are indeed involved. Something like:

  • The command as a whole
  • xargs
  • php
  • MySQL.

But there is still effectively no parallelization, because these processes are separate but still run serially, one after the other. It is the same reason that gamers benefit more from very fast processors than from many processors. Unfortunately, at TransIP I cannot choose faster processors, only more processors. Read on!

With parallelisation

In Bash, & indicates that the next command may start before the current command (the one the & belongs to) has finished. And that turns out to work fine for parallelizing! This is the same code as above, applied about 100 times:

# Parallel (2x)
########################################
#
date
	wp wc shop_order list --user=4 --field=id --per_page=50             | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=50 --offset=50 | xargs -n1 wp wc shop_order delete --user=4 --force=1
date
# Parallel (4x)
########################################
#
date
	wp wc shop_order list --user=4 --field=id --per_page=25             | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=25 --offset=25 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=25 --offset=50 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=25 --offset=75 | xargs -n1 wp wc shop_order delete --user=4 --force=1	
date
# Parallel (8x)
########################################
#
date
	wp wc shop_order list --user=4 --field=id --per_page=12             | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=12 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=24 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=36 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &	
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=48 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=60 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=72 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=12 --offset=84 | xargs -n1 wp wc shop_order delete --user=4 --force=1	
date
# Parallel (16x)
########################################
#
date
	wp wc shop_order list --user=4 --field=id --per_page=6             | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=6  | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=12 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=18 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=24 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=30 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=36 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=42 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=48 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=54 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=60 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=66 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=72 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=78 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=84 | xargs -n1 wp wc shop_order delete --user=4 --force=1 &
	wp wc shop_order list --user=4 --field=id --per_page=6 --offset=90 | xargs -n1 wp wc shop_order delete --user=4 --force=1
date
  • The trick is that every line except the last one ends with &, so all of these commands start right away and run side by side.
  • The last parallel command has no &, i.e. the script does not continue until this last line has finished. As a result, more or less the specified number of parallel commands are active at any time. This is not 100% efficient, and commands still running in the background when the last line finishes are not waited for (Bash's wait builtin would do that), but it goes a long way.
  • If all lines end with &, there is no limit on the number of parallel commands. This gives two types of error messages: (1) IDs that don't seem to exist, because another process has already deleted them; (2) MySQL running out of sockets (or whatever it's called): no more IPCs (if that is the correct term) can be set up.
  • It is not a disaster to run more commands in parallel than there are cores: a computer normally does a lot of things in parallel anyway, and this simply adds to that.
  • It is crucial that the workload of the original command can be split (in this case: 100x the same command, but with different arguments). This is done using --per_page and --offset.
  • This code can probably be made more efficient by collecting all arguments with one command (that is, a single wp wc shop_order list command) and distributing the result over parallel wp wc shop_order delete commands.
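
The idea in the last bullet (one list command, many parallel delete commands) could be sketched with the -P option of xargs, which GNU and BSD xargs provide but POSIX does not require. This is only a sketch under assumptions: list_ids and delete_cmd are hypothetical stand-ins, so that the example runs without a WordPress installation, and the job count of 4 is arbitrary. Because -P caps the number of simultaneous processes, it may also avoid the "no limit" problem described above.

```shell
#!/usr/bin/env bash
# Sketch: fetch all IDs once, then let xargs spread them over parallel workers.

# Hypothetical stand-in for "wp wc shop_order list ..." so the sketch
# runs anywhere; it just prints the IDs 1..20, one per line.
list_ids() {
    seq 1 20
}

# Hypothetical stand-in for "wp wc shop_order delete ...".
delete_cmd=(echo deleted)

# -n1: pass one ID per invocation.
# -P4: run at most 4 invocations at the same time.
result=$(list_ids | xargs -n1 -P4 "${delete_cmd[@]}")

# The order of the output lines is not guaranteed, because the
# invocations run concurrently.
echo "$result"
```

With the real commands from this page, the pipeline would presumably look like wp wc shop_order list --user=4 --field=id --per_page=100 | xargs -n1 -P4 wp wc shop_order delete --user=4 --force=1, but that variant has not been timed here.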

Parallelize a chunk of code

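Quite often, I would like to parallelize more than just an individual command. How to do that? For starters: this doesn't seem to need any explicit subshell syntax. A function call, or a group of commands in braces, can be put in the background with & just like a single command (behind the scenes, Bash runs such a background job in a subshell).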
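
One possible approach, as a minimal sketch: wrap the chunk in a function, start it in the background once per slice of the workload, and then call the wait builtin so the script only continues when every job is done. The function process_chunk, the offsets and the sleep are hypothetical stand-ins; with WP-CLI, the function body could be the list | delete pipeline for one --offset.

```shell
#!/usr/bin/env bash
# Sketch: run a multi-command chunk as background jobs, then wait for all of them.

# Hypothetical chunk of work: several commands that belong together.
process_chunk() {
    local offset=$1
    sleep 1                              # stand-in for slow work
    echo "chunk at offset $offset done"
}

start=$SECONDS
outfile=$(mktemp)

for offset in 0 25 50 75; do
    # The whole function call becomes one background job.
    process_chunk "$offset" >>"$outfile" &
done

# Block until every background job started above has finished.
wait

elapsed=$(( SECONDS - start ))
chunks_done=$(grep -c 'done$' "$outfile")
echo "$chunks_done chunks finished in about ${elapsed}s"
rm -f "$outfile"
```

Run one after the other, the four chunks would take about 4 s; as background jobs they should finish in roughly the time of one chunk. A brace group works the same way: { command1; command2; } & puts both commands in the background as one job.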

See also

Sources