Nested parfor and for-Loops and Other parfor Requirements
Nested parfor-Loops
You cannot use a parfor-loop inside another parfor-loop. As an example, the following nesting of parfor-loops is not allowed:
    parfor i = 1:10
        parfor j = 1:5
            ...
        end
    end
Tip
You cannot nest parfor directly within another parfor-loop. A parfor-loop can call a function that contains a parfor-loop, but you do not get any additional parallelism.
Code Analyzer in the MATLAB® Editor flags the use of parfor inside another parfor-loop.
You cannot nest parfor-loops because parallelization can be performed at only one level. Therefore, choose which loop to run in parallel, and convert the other loop to a for-loop.
Consider the following performance issues when dealing with nested loops:
- Parallel processing incurs overhead. Generally, you should run the outer loop in parallel, because overhead only occurs once. If you run the inner loop in parallel, then each of the multiple parfor executions incurs an overhead. See Convert Nested for-Loops to parfor-Loops for an example of how to measure parallel overhead.
- Make sure that the number of iterations exceeds the number of workers. Otherwise, you do not use all available workers.
- Try to balance the parfor-loop iteration times. parfor tries to compensate for some load imbalance.
Tip
Always run the outermost loop in parallel, because you reduce parallel overhead.
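The second point in the list above can be checked before a loop starts. The following is a minimal sketch, not part of the original example set. It assumes a hypothetical loop length numIterations and uses gcp('nocreate') so that no pool is started just to perform the check.

```matlab
numIterations = 100;               % hypothetical loop length, for illustration

pool = gcp('nocreate');            % current pool, or empty if none is open
if isempty(pool)
    numWorkers = 0;                % no pool yet; parfor may start one later
else
    numWorkers = pool.NumWorkers;
end

% Warn when some workers would receive no iterations at all.
if numWorkers > 0 && numIterations < numWorkers
    warning('Only %d of %d workers would receive iterations.', ...
        numIterations, numWorkers);
end
```

If the warning fires, the other loop of the nest may be a better candidate for parallelization.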
You can also use a function that uses parfor and embed it in a parfor-loop. Parallelization occurs only at the outer level.
In the following example, call a function MyFun.m inside the outer parfor-loop. The inner parfor-loop embedded in MyFun.m runs sequentially, not in parallel.
    parfor i = 1:10
        MyFun(i)
    end

    function MyFun(i)
        parfor j = 1:5
            ...
        end
    end
Tip
Nested parfor-loops generally give you no computational benefit.
Convert Nested for-Loops to parfor-Loops
A typical use of nested loops is to step through an array using one loop variable to index one dimension, and a nested loop variable to index another dimension. The basic form is:
    X = zeros(n,m);
    for a = 1:n
        for b = 1:m
            X(a,b) = fun(a,b);
        end
    end
The following code shows a simple example. Use tic and toc to measure the computing time needed.
    A = 100;
    tic
    for i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    toc
Elapsed time is 49.376732 seconds.
You can parallelize either of the nested loops, but you cannot run both in parallel. The reason is that the workers in a parallel pool cannot start or access further parallel pools.
If the loop counted by i is converted to a parfor-loop, then each worker in the pool executes the nested loops using the j loop counter. The j loops themselves cannot run as a parfor on each worker.
Because parallel processing incurs overhead, you must choose carefully whether you want to convert either the inner or the outer for-loop to a parfor-loop. The following example shows how to measure the parallel overhead.
First convert only the outer for-loop to a parfor-loop. Use tic and toc to measure the computing time needed. Use ticBytes and tocBytes to measure how much data is transferred to and from the workers in the parallel pool.
Run the new code, and run it again. The first run is slower than subsequent runs, because the parallel pool takes some time to start and make the code available to the workers.
    A = 100;
    tic
    ticBytes(gcp);
    parfor i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc
             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

        1          32984                    24512
        2          33784                    25312
        3          33784                    25312
        4          34584                    26112
        Total      1.3514e+05               1.0125e+05

    Elapsed time is 14.130674 seconds.
Next convert only the inner loop to a parfor-loop. Measure the time needed and data transferred as in the previous case.
    A = 100;
    tic
    ticBytes(gcp);
    for i = 1:100
        parfor j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc
             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________

        1          1.3496e+06               5.487e+05
        2          1.3496e+06               5.4858e+05
        3          1.3677e+06               5.6034e+05
        4          1.3476e+06               5.4717e+05
        Total      5.4144e+06               2.2048e+06

    Elapsed time is 48.631737 seconds.
If you convert the inner loop to a parfor-loop, both the time and the amount of data transferred are much greater than in the parallel outer loop. In this case, the elapsed time is almost the same as in the nested for-loop example. The speedup is smaller than when running the outer loop in parallel, because you have more data transfer and thus more parallel overhead. Therefore, if you execute the inner loop in parallel, you get no computational benefit compared to running the serial for-loop.
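To put numbers on this comparison, the short calculation below derives speedups and data-transfer ratios from the three measurements reported above. The variable names are chosen here for illustration. The figures come from one run on a four-worker pool, so treat the ratios as indicative only.

```matlab
% Figures copied from the serial, outer-parfor, and inner-parfor runs above.
serialTime = 49.376732;      % seconds, nested for-loops
outerTime  = 14.130674;      % seconds, outer loop as parfor
innerTime  = 48.631737;      % seconds, inner loop as parfor

outerSent     = 1.3514e+05;  % total bytes sent to workers
innerSent     = 5.4144e+06;
outerReceived = 1.0125e+05;  % total bytes received from workers
innerReceived = 2.2048e+06;

outerSpeedup  = serialTime / outerTime;       % roughly 3.5
innerSpeedup  = serialTime / innerTime;       % roughly 1.0
sentRatio     = innerSent / outerSent;        % roughly 40
receivedRatio = innerReceived / outerReceived; % roughly 22

fprintf('Speedup: outer %.2fx, inner %.2fx\n', outerSpeedup, innerSpeedup);
fprintf('Inner loop sends %.0fx and receives %.0fx the data\n', ...
    sentRatio, receivedRatio);
```

In this run, the inner-loop version moves about 40 times as much data to the workers for essentially no gain in elapsed time.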
If you want to reduce parallel overhead and speed up your computation, run the outer loop in parallel.
If you convert the inner loop instead, then each iteration of the outer loop initiates a separate parfor-loop. That is, the inner loop conversion creates 100 parfor-loops. Each of the multiple parfor executions incurs overhead. If you want to reduce parallel overhead, you should run the outer loop in parallel instead, because overhead only occurs once.
Tip
If you want to speed up your code, always run the outer loop in parallel, because you reduce parallel overhead.
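When the outer loop alone has too few iterations to occupy all workers, one alternative is to collapse both loops into a single parfor-loop over a linear index. This technique is not part of the measurements above. It is a sketch under assumptions, with hypothetical sizes numRows and numCols and a placeholder computation standing in for the real loop body.

```matlab
numRows = 3;                         % hypothetical outer-loop length
numCols = 4;                         % hypothetical inner-loop length

out = zeros(1, numRows*numCols);     % sliced output, one element per iteration
parfor k = 1:numRows*numCols
    % Recover the two original loop counters from the linear index.
    [r, c] = ind2sub([numRows, numCols], k);
    out(k) = 10*r + c;               % placeholder for the real computation
end
out = reshape(out, numRows, numCols);  % back to the two-dimensional layout
disp(out)
```

There is still only one parfor-loop, so the overhead occurs once, but the loop now offers numRows*numCols iterations to distribute.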
Nested for-Loops: Requirements and Limitations
If you want to convert a nested for-loop to a parfor-loop, you must ensure that your loop variables are properly classified. See Troubleshoot Variables in parfor-Loops. If your code does not adhere to the guidelines and restrictions labeled as Required, you get an error. MATLAB catches some of these errors at the time it reads the code. These errors are labeled as Required (static).
Required (static): You must define the range of a for-loop nested in a parfor-loop by constant numbers or broadcast variables.
In the following example, the invalid code does not work because you define the upper limit of the for-loop by a function call. The valid code provides a workaround by first defining a broadcast or constant variable outside the parfor-loop:

Invalid:

    A = zeros(100, 200);
    parfor i = 1:size(A, 1)
        for j = 1:size(A, 2)
            A(i, j) = i + j;
        end
    end

Valid:

    A = zeros(100, 200);
    n = size(A, 2);
    parfor i = 1:size(A,1)
        for j = 1:n
            A(i, j) = i + j;
        end
    end
Required (static): The index variable for the nested for-loop must never be explicitly assigned other than by its for statement.
Following this restriction is required. If the nested for-loop variable is changed anywhere in a parfor-loop other than by its for statement, the region indexed by the for-loop variable is not guaranteed to be available at each worker.
The invalid code below tries to modify the value of the nested for-loop variable j in the body of the loop. The valid code provides a workaround by assigning the nested for-loop variable to a temporary variable t, and then updating t.

Invalid:

    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            A(i, j) = 1;
            j = j+1;
        end
    end

Valid:

    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            A(i, j) = 1;
            t = j;
            t = t + 1;
        end
    end
Required (static): You cannot index or subscript a nested for-loop variable.
Following this restriction is required. If a nested for-loop variable is indexed, iterations are not guaranteed to be independent.
The invalid example below attempts to index the nested for-loop variable j. The valid example removes this indexing.

Invalid:

    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            j(1);
        end
    end

Valid:

    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            j;
        end
    end
Required (static): When using the nested for-loop variable for indexing a sliced array, you must use the variable in plain form, not as part of an expression.
For example, the invalid code below does not work because it indexes A with the expression j + 1, but the valid code does because it shifts the loop range and indexes A with j alone:

Invalid:

    A = zeros(4, 11);
    parfor i = 1:4
        for j = 1:10
            A(i, j + 1) = i + j;
        end
    end

Valid:

    A = zeros(4, 11);
    parfor i = 1:4
        for j = 2:11
            A(i, j) = i + j - 1;
        end
    end
Required (static): If you use a nested for-loop to index into a sliced array, you cannot use that array elsewhere in the parfor-loop.
In the following example, the invalid code does not work because A is sliced and indexed inside the nested for-loop and is then used again elsewhere in the parfor-loop. The valid code works because the nested loop fills a temporary vector v, and v is assigned to A outside of the nested loop:

Invalid:

    A = zeros(4, 10);
    parfor i = 1:4
        for j = 1:10
            A(i, j) = i + j;
        end
        disp(A(i, j))
    end

Valid:

    A = zeros(4, 10);
    parfor i = 1:4
        v = zeros(1, 10);
        for j = 1:10
            v(j) = i + j;
        end
        disp(v(j))
        A(i, :) = v;
    end
parfor-Loop Limitations
Nested Functions
The body of a parfor-loop cannot reference a nested function. However, it can call a nested function by a function handle. Try the following example. Note that A(idx) = nfcn(idx) in the parfor-loop does not work. You must use feval to invoke the fcn handle in the parfor-loop body.
    function A = pfeg
        function out = nfcn(in)
            out = 1 + in;
        end
        fcn = @nfcn;
        parfor idx = 1:10
            A(idx) = feval(fcn, idx);
        end
    end
    >> pfeg
    Starting parallel pool (parpool) using the 'Processes' profile ...
    connected to 4 workers.

    ans =

         2     3     4     5     6     7     8     9    10    11
Tip
If you use function handles that refer to nested functions inside a parfor-loop, then the values of externally scoped variables are not synchronized among the workers.
Nested parfor-Loops
The body of a parfor-loop cannot contain a parfor-loop. For more information, see Nested parfor-Loops.
Nested spmd Statements
The body of a parfor-loop cannot contain an spmd statement, and an spmd statement cannot contain a parfor-loop. The reason is that workers cannot start or access further parallel pools.
break and return Statements
The body of a parfor-loop cannot contain break or return statements. If you need to stop work early, consider parfeval or parfevalOnAll instead, because you can use cancel on them.
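As a rough illustration of that alternative, the sketch below submits independent tasks with parfeval, collects results as they finish with fetchNext, and cancels the remaining tasks once a condition is met. The task function (squaring a number) and the threshold of 100 are hypothetical placeholders. Which task satisfies the condition first depends on scheduling, so the reported index can differ between runs.

```matlab
pool = gcp;                                   % current pool, started if needed
numTasks = 20;                                % hypothetical number of tasks

% Submit one asynchronous task per would-be loop iteration.
futures(1:numTasks) = parallel.FevalFuture;
for k = 1:numTasks
    futures(k) = parfeval(pool, @(x) x^2, 1, k);
end

% Collect results in completion order and stop at the first one above 100.
firstHit = [];
for k = 1:numTasks
    [idx, value] = fetchNext(futures);
    if value > 100
        firstHit = idx;
        cancel(futures);                      % plays the role of break
        break
    end
end
fprintf('Stopped after task %d returned a value above 100.\n', firstHit);
```

The break here sits in an ordinary for-loop on the client, which is allowed. The parallel work itself is stopped by cancel.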
Global and Persistent Variables
The body of a parfor-loop cannot contain global or persistent variable declarations. The reason is that these variables are not synchronized between workers. You can use global or persistent variables within functions, but their value is visible only to the worker that creates them. Instead of global variables, it is a better practice to use function arguments to share values.
To learn more about variable requirements, see Troubleshoot Variables in parfor-Loops.
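The following minimal sketch shows the function-argument approach. The names scaleFactor and scaleValue are hypothetical. The setting is defined once on the client and handed to the function explicitly, and the handle is invoked through feval, in line with the feval guidance for function handles earlier on this page.

```matlab
scaleFactor = 3;                          % setting defined once on the client
scaleValue = @(x, factor) factor * x;     % receives the setting as an input

result = zeros(1, 5);
parfor i = 1:5
    % scaleFactor is sent to the workers as a broadcast variable,
    % so every worker sees the same value.
    result(i) = feval(scaleValue, i, scaleFactor);
end
disp(result)
```

Because the value travels with the call, no worker depends on state that only another worker or the client can see.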
Scripts
If a script introduces a variable, you cannot call this script from within a parfor-loop or spmd statement. The reason is that this script would cause a transparency violation. For more details, see Ensure Transparency in parfor-Loops or spmd Statements.
Anonymous Functions
You can define an anonymous function inside the body of a parfor-loop. However, sliced output variables inside anonymous functions are not supported. You can work around this by using a temporary variable for the sliced variable, as shown in the following example.
    x = 1:10;
    parfor i=1:10
        temp = x(i);
        anonymousFunction = @() 2*temp;
        x(i) = anonymousFunction() + i;
    end
    disp(x);
For more information on sliced variables, see Sliced Variables.
inputname Functions
Using inputname to return the workspace variable name corresponding to an argument number is not supported inside parfor-loops. The reason is that parfor workers do not have access to the workspace of the MATLAB desktop. To work around this, call inputname before parfor, as shown in the following example.
    a = 'a';
    myFunction(a)

    function X = myFunction(a)
        name = inputname(1);
        parfor i=1:2
            X(i).(name) = i;
        end
    end
load Functions
The syntaxes of load that do not assign to an output structure are not supported inside parfor-loops. Inside parfor, always assign the output of load to a structure.
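A minimal sketch of the supported form follows. It writes a small temporary MAT-file first so that the example is self-contained. The file name and the variable offset are hypothetical, and the sketch assumes the workers can read the client's temporary folder, as is typical for a local pool.

```matlab
dataFile = [tempname '.mat'];     % hypothetical temporary MAT-file
offset = 42;
save(dataFile, 'offset');         % saved on the client, outside the parfor-loop

values = zeros(1, 3);
parfor i = 1:3
    s = load(dataFile);           % output assigned to a structure
    values(i) = s.offset + i;     % read the loaded variable through the struct
end
delete(dataFile);
disp(values)
```

Calling load(dataFile) without an output would try to create variables directly in the loop workspace, which is the unsupported form.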
nargin or nargout Functions
The following uses are not supported inside parfor-loops:

- Using nargin or nargout without a function argument
- Using narginchk or nargoutchk to validate the number of input or output arguments in a call to the function that is currently executing
The reason is that workers do not have access to the workspace of the MATLAB desktop. To work around this, call these functions before parfor, as shown in the following example.
    myFunction('a','b')

    function X = myFunction(a,b)
        nin = nargin;
        parfor i=1:2
            X(i) = i*nin;
        end
    end
P-Code Scripts
You can call P-code script files from within a parfor-loop, but P-code scripts cannot contain a parfor-loop. To work around this, use a P-code function instead of a P-code script.
See Also
parfor | parfeval | parfevalOnAll