Using Groups of Rows in a Parfor Loop

Question

Below is my code, which attempts to populate a variable using sections of rows. The original ContractFile has hundreds of thousands of rows. My thinking is that a parfor loop could populate the variable 10,000 rows at a time, with each section handled by a different worker. Example: rows 1:10,000 go to one worker, rows 10,001:20,000 to a different worker, etc. This code works as a regular for loop but breaks as a parfor loop, and I can't figure out why. Thanks!

parfor i = 1:Contracts
    Rows = (i-1)*10000+(1:10000);
    Var1(Rows,:) = ContractFile(Rows,2) .* ContractFile(Rows,8);
end

Matt J on 1 Apr 2020

Why is the loop necessary? Why not simply,

Var1=ContractFile(:,2) .* ContractFile(:,8);

Answer 1

Matt J on 1 Apr 2020

Edited: Matt J on 1 Apr 2020

As mentioned in my comment, your example does not make it clear why a loop is necessary at all. However, the reason for your difficulty is that your parfor code violates the rules for indexing sliced variables in a parfor loop. One way to fix it is as follows:

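% Preallocate one column per block of 10000 rows
Var1=nan(10000,Contracts);
% Reshape columns 2 and 8 so that block i becomes column i
A=reshape(ContractFile(1:Contracts*10000,2),10000,[]);
B=reshape(ContractFile(1:Contracts*10000,8),10000,[]);
parfor i = 1:Contracts
    Var1(:,i) = A(:,i).*B(:,i);
end
% Stack the columns back into a single column vector
Var1=Var1(:);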
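The last line of the code above relies on MATLAB's column-major storage: indexing with a single colon stacks the columns of a matrix one after another. A minimal sketch of that behavior, using a hypothetical 3-by-2 matrix M that is not part of the original code:

```matlab
% Hypothetical toy matrix: column 1 is [1;2;3], column 2 is [4;5;6]
M = [1 4; 2 5; 3 6];

% M(:) stacks the columns top to bottom into one column vector
v = M(:);

% reshape with [] lets MATLAB infer the row count; same result as M(:)
w = reshape(M, [], 1);

disp(isequal(v, w))
```

Applied to a 10000xN matrix, the same stacking places column 2 in rows 10001:20000, column 3 in rows 20001:30000, and so on.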

Matt J on 2 Apr 2020

Edited: Matt J on 2 Apr 2020

"Is there an easier way to explain why it wasn't working for the original code I posted?"

You are attempting to treat Var1 and ContractFile as "sliced variables". When you index sliced variables, the indexing has to be a simple expression involving the loop variable. Using arbitrary index vectors like "Rows" is not allowed. From the documentation,

"Form of Indexing. Within the first-level of indexing for a sliced variable, exactly one indexing expression is of the form i, i+k, i-k, or k+i. The index i is the loop variable and k is a scalar integer constant or a simple (non-indexed) broadcast variable. Every other indexing expression is a positive integer constant, a simple (non-indexed) broadcast variable, a nested for-loop index variable, colon, or end."

"It doesn't appear this code breaks up the rows into groups to be utilized."

By reshaping the data into 10000xN matrices, the batches of data you are looking for become separate matrix columns, and can be indexed in the form A(:,i), B(:,i) and Var1(:,i), which is legal for sliced variables according to the above rule.
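The difference between the two indexing forms can be tried on a small case. The following is only a sketch under assumed values: blockSize, the block count of 3, and the synthetic ContractFile are hypothetical stand-ins for the real 10000-row blocks and data.

```matlab
% Hypothetical small case: 3 blocks of 4 rows instead of blocks of 10000
blockSize = 4;
Contracts = 3;
ContractFile = reshape(1:blockSize*Contracts*8, [], 8);  % 12-by-8 synthetic data

% Not accepted by parfor: Var1(Rows,:) with Rows = (i-1)*blockSize+(1:blockSize),
% because Rows is an index vector, not one of the forms i, i+k, i-k, or k+i.

% Accepted: after reshaping, block i is simply column i
A = reshape(ContractFile(:,2), blockSize, []);
B = reshape(ContractFile(:,8), blockSize, []);
Var1 = nan(blockSize, Contracts);
parfor i = 1:Contracts
    Var1(:,i) = A(:,i) .* B(:,i);   % sliced indexing of the form (:,i)
end
Var1 = Var1(:);

% Compare against the plain vectorized product
expected = ContractFile(:,2) .* ContractFile(:,8);
disp(isequal(Var1, expected))
```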

"What I'm hoping is that I could solve for Var1 by having:"

This would happen if the number of workers N is equal to the number of loop iterations "Contracts". Otherwise, it will split the loop into N smaller loops and each worker will process a consecutive chunk of Contracts/N iterations.

Derek De Vries on 2 Apr 2020

Edited: Derek De Vries on 2 Apr 2020

I apologize, I misunderstood how this code was working and it works great - so thank you!

A couple of follow up questions:

1. How does the final line of code execute? It appears to me that the code just sets Var1 equal to all rows of the current Var1, so I would expect it to remain a 10,000xN matrix. How did it know to move columns 2:N to rows 10,001, 20,001, etc.?
2. If the last "block" of rows isn't exactly 10,000 rows (example: there are 123,456 total rows in the contract file), is there a way to specify inside the parfor loop that the final group should be 3,456 rows instead of 10,000, or would that piece have to be handled outside the loop at the end?

Matt J on 2 Apr 2020

Edited: Matt J on 2 Apr 2020

1. Var1=Var1(:) is equivalent to Var1=reshape(Var1,[],1). See also

https://www.mathworks.com/help/matlab/ref/colon.html#description

2. You could pre-pad the array (e.g., with NaNs) so that its row count is an exact multiple of 10000 and then proceed as above. Alternatively, instead of reshaping, you could partition your input data into cells and use cell array indexing. To do this splitting, I recommend mat2tiles (available on the MATLAB File Exchange) because it handles row counts that are not a multiple of 10000, as demonstrated in the code below.

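CF=mat2tiles(ContractFile,[10000,Inf]);  % cell array of blocks of up to 10000 rows
N=numel(CF);

Var1=cell(N,1);

parfor i=1:N
    % CF{i} is sliced, so each worker receives only its own block
    Var1{i}=CF{i}(:,2).*CF{i}(:,8);
end

Var1=cell2mat(Var1);  % concatenate the blocks into one column vector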
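The pre-padding alternative mentioned in point 2 could look roughly like the sketch below. It is an illustration only: blockSize, nBlocks, padRows, and the 10-row synthetic ContractFile are hypothetical names and values chosen so that the last block is incomplete.

```matlab
% Hypothetical small case: 10 rows in blocks of 4, so the last block has 2 rows
blockSize = 4;
ContractFile = reshape(1:80, 10, 8);       % 10-by-8 synthetic data
nRows = size(ContractFile, 1);
nBlocks = ceil(nRows / blockSize);         % 3 blocks
padRows = nBlocks*blockSize - nRows;       % 2 rows of padding needed

% Pad the two columns of interest with NaN so they reshape evenly
A = reshape([ContractFile(:,2); nan(padRows,1)], blockSize, nBlocks);
B = reshape([ContractFile(:,8); nan(padRows,1)], blockSize, nBlocks);

Var1 = nan(blockSize, nBlocks);
parfor i = 1:nBlocks
    Var1(:,i) = A(:,i) .* B(:,i);
end

Var1 = Var1(:);          % stack the blocks into one column
Var1 = Var1(1:nRows);    % drop the entries that came from the NaN padding
```

Padding keeps the simple (:,i) slicing at the cost of a few wasted multiplications in the last block, whereas the cell-based version above avoids the padding and the final trim.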
