# How to parallelize a MATLAB function on the CPU

SH on 10 Mar 2023
I have a MATLAB function and a dataset, and I would like to run the function in parallel on the CPU. Specifically, I would like to compare the performance of running the function with and without parallelization.
I have also attached my dataset below.
How can I do that in MATLAB?
clusternumber = 6;
function [Score] = Scorefunction(Dataset,clusternumber)
    dataset_len = size(Dataset,1);
    Score = zeros(1,clusternumber);
    for j=1:clusternumber
        [cluster_assignments,centroids] = kmeans(Dataset,j);
        distance_within=zeros(dataset_len,1);
        distance_between=Inf(dataset_len,j);
        for i=1:dataset_len
            for jj=1:j
                boo=cluster_assignments==cluster_assignments(i);
                Xsamecluster=Dataset(boo,:);
                if size(Xsamecluster,1)>1
                    distance_within(i)=sum(sum((Dataset(i,:)-Xsamecluster).^2,2))/(size(Xsamecluster,1)-1);
                end
                boo1= cluster_assignments~=cluster_assignments(i);
                Xdifferentcluster=Dataset(boo1 & cluster_assignments ==jj,:);
                if ~isempty(Xdifferentcluster)
                    distance_between(i,jj)=mean(sum((Dataset(i,:)-Xdifferentcluster).^2,2));
                end
            end
        end
        minavgDBetween = min(distance_between, [], 2);
        silh = (minavgDBetween - distance_within) ./ max(distance_within,minavgDBetween);
        Score(j) =mean(silh);
    end
end
##### 1 comment
SH on 10 Mar 2023


### Answers (2)

Arka on 10 Mar 2023
Hi,
You can benefit from parallel computing by using the functions provided in the Parallel Computing Toolbox.
I have modified the code to include the required functions:
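clusternumber = 6;
% Create an independent job on the cluster
c = parcluster;
job = createJob(c);
% Create a new task in the job. The posted code left this step as an empty,
% commented-out loop; the call below is an assumed reconstruction, and
% Dataset must already be loaded in the workspace.
createTask(job, @Scorefunction, 1, {Dataset, clusternumber});
submit(job); % run the job
wait(job); % wait for the job to end
results = fetchOutputs(job); % get the results (returned as a cell array)
disp(results);
delete(job); % delete the job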
function [Score] = Scorefunction(Dataset,clusternumber)
    dataset_len = size(Dataset,1);
    Score = zeros(1,clusternumber);
    for j=1:clusternumber
        [cluster_assignments,centroids] = kmeans(Dataset,j);
        distance_within=zeros(dataset_len,1);
        distance_between=Inf(dataset_len,j);
        for i=1:dataset_len
            for jj=1:j
                boo=cluster_assignments==cluster_assignments(i);
                Xsamecluster=Dataset(boo,:);
                if size(Xsamecluster,1)>1
                    distance_within(i)=sum(sum((Dataset(i,:)-Xsamecluster).^2,2))/(size(Xsamecluster,1)-1);
                end
                boo1= cluster_assignments~=cluster_assignments(i);
                Xdifferentcluster=Dataset(boo1 & cluster_assignments ==jj,:);
                if ~isempty(Xdifferentcluster)
                    distance_between(i,jj)=mean(sum((Dataset(i,:)-Xdifferentcluster).^2,2));
                end
            end
        end
        minavgDBetween = min(distance_between, [], 2);
        silh = (minavgDBetween - distance_within) ./ max(distance_within,minavgDBetween);
        Score(j) =mean(silh);
    end
end
##### 4 comments
Arka on 10 Mar 2023
Edited: Arka on 10 Mar 2023
You can use parfor as well, but my implementation took about 10 seconds less than parfor. I assume the extra time goes into creating the pool of workers.
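One way to check the pool-startup assumption is to start the pool explicitly and time that step on its own, so it is not counted against parfor. This is a minimal sketch, assuming the Parallel Computing Toolbox is installed and the default cluster profile is usable; the variable names are only illustrative.

```matlab
% Requires Parallel Computing Toolbox.
pool = gcp('nocreate');          % returns [] if no pool is currently running
if isempty(pool)
    tStart = tic;
    pool = parpool;              % default profile and default number of workers
    fprintf('Pool startup took %.1f s\n', toc(tStart));
end
fprintf('Pool has %d workers\n', pool.NumWorkers);
```

Once the pool is already running, later parfor timings no longer include the startup cost.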
I have altered the code to add the stopwatch timer like so:
tic
submit(job); % run the job
wait(job); % wait for the job to end
results = fetchOutputs(job); % get the results
toc
This prints the time elapsed between the tic and toc calls. You can time the sequential function call in the same way and then compare the two results.
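For the sequential side of the comparison, a sketch along these lines could work. The `work` handle is a hypothetical stand-in so the snippet runs on its own; in practice it would be replaced by a call such as `Scorefunction(Dataset,clusternumber)` once the dataset is loaded. `timeit` is shown as an optional alternative to a single tic/toc measurement.

```matlab
% Hypothetical stand-in for the real workload; swap in
% @() Scorefunction(Dataset, clusternumber) once Dataset is loaded.
work = @() sum(svd(rand(300)));

t0 = tic;                  % named timer, so other tic/toc calls do not interfere
work();
tSequential = toc(t0);     % elapsed time of one run, in seconds

tTypical = timeit(work);   % timeit calls the handle several times and returns a typical time

fprintf('single run: %.4f s, timeit: %.4f s\n', tSequential, tTypical);
```

A single tic/toc run can be noisy, so repeating the measurement (or using `timeit`) tends to give a fairer baseline.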
SH on 10 Mar 2023
@Arka But the parallel version takes more time than the sequential one, which does not seem right.


Raymond Norris on 10 Mar 2023
As @Arka mentioned, consider using the Parallel Computing Toolbox. In your case, I would suggest looking at rewriting your outer for-loop as a parfor.
To begin with, let's write a helper function:
function t = run_me
    % Dataset must already be loaded here; the loading step is not shown in the post.
    clusternumber = 6;
    t0 = tic;
    Scorefunction(Dataset,clusternumber);
    t = toc(t0);
end
I'll run it once as is, then modify Scorefunction by changing
for j=1:clusternumber
to
parfor j=1:clusternumber
Before running it a second time, I started a parallel pool of 6 workers (the size of the pool should be a factor of the number of parfor iterations, 6 in this case).
Here's my complete run, showing a speedup of about 3.2 (the tic/toc should really sit directly around the parfor-loop, but I didn't want to modify your code more than I needed to).
>> t = run_me
t =
15.5585
>>
>> parpool('local',6);
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).
>> tparallel = run_me
tparallel =
4.8057
>>
>> t/tparallel
ans =
3.2375
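To illustrate placing the timer directly around the loop, here is a minimal sketch. It assumes the Parallel Computing Toolbox is available, and it uses a hypothetical stand-in `work` for one loop iteration rather than the real kmeans-based body, so the timings it prints are not comparable to the numbers above.

```matlab
% Requires Parallel Computing Toolbox.
n = 6;                                   % number of loop iterations, as in the thread
work = @(j) sum(svd(magic(200 + j)));    % hypothetical stand-in for one iteration

if isempty(gcp('nocreate'))
    parpool;                             % start the pool outside the timed region
end

scoreSerial = zeros(1, n);
t0 = tic;
for j = 1:n
    scoreSerial(j) = work(j);
end
tSerial = toc(t0);

scoreParallel = zeros(1, n);
t0 = tic;                                % timer wraps only the parfor-loop
parfor j = 1:n
    scoreParallel(j) = work(j);
end
tParallel = toc(t0);

fprintf('serial %.3f s, parfor %.3f s, speedup %.2f\n', ...
    tSerial, tParallel, tSerial / tParallel);
```

With a workload this small, the parfor version may well be slower than the plain loop because of communication overhead; the point of the sketch is only where the timer goes.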

Release: R2022b
