How to reduce large Computational time

3 vues (au cours des 30 derniers jours)
Navdeep Sony
Navdeep Sony le 11 Mar 2017
Commenté : Navdeep Sony le 13 Mar 2017
Please find the code attached below. Also please find three .mat files that have been attached. The problem is that the code takes very long time to execute. Kindly help what can be done to reduce time. Thanks.
clc,clear;
load('E');
load('R');
load('P');
sparse=zeros(14596,2209);
final=zeros(144,2209);
parfor ii=1:size(tem4,2)
dictionary=tem2;
t=5;
s=tem4(:,ii);
r=s;
atoms=zeros(size(dictionary,1),size(dictionary,2));
coefs=zeros(size(dictionary,2),1);
%Normalize the dictionary
for index=1:size(dictionary,2)
dictionary(:,index)=dictionary(:,index)./norm(dictionary(:,index));
end
D=dictionary;
% plotv(dictionary);
% hold on
% plotv(s);
index=[];
while(t>1e-15 && sum(dictionary(:)~=0))%Process while (Eucledian norm > 10^-15)
inner_product=dictionary'*r; %Dot Product
% inner_product=dot(dictionary,r');
[m,ind]=max(abs(inner_product));
index=[index ind];
atoms(:,ind)=dictionary(:,ind); %Select atom which has maximum inner product
dictionary(:,ind)=0;
x=(atoms(:,index)'*atoms(:,index))\(atoms(:,index)'*r);
coefs(index)=x;
r=r-atoms(:,index)*x;
t=norm(r);
% plotv(r);
% hold on
end
sparse(:,ii)=coefs;
sig=D*coefs;
ff=uint8(((((max(s))-min(s))/((max(sig)-min(sig))))*(sig-min(sig)))+min(s));
final(:,ii)=ff;
end

Réponse acceptée

Jan
Jan le 12 Mar 2017
Modifié(e) : Jan le 12 Mar 2017
function final = testfcn
% data1 = load('E'); % Not used
data2 = load('R');
data3 = load('P');
% tem1 = data1.tem1;
tem2 = data2.tem2;
tem4 = data3.tem4;
sparse = zeros(14596, 2209);
final = zeros(144, 2209, 'uint8');
tic
% Normalize the dictionary - outside the ii loop:
D = zeros(size(tem2));
for index = 1:size(D, 2)
D(:,index) = tem2(:,index) ./ norm(tem2(:,index));
% v = dictionary(:, index);
% dictionary(:,index) = v ./ sqrt(v' * v); % Not exactly the same!!!
end
warning('OFF', 'MATLAB:nearlySingularMatrix');
parfor ii = 1:15 % size(tem4,2)
t = 5;
s = tem4(:, ii);
r = s;
atoms = zeros(size(D));
coefs = zeros(size(D, 2),1);
dictionary = D;
index = [];
while t > 1e-15 && any(dictionary(:))
% inner_product = dictionary' * r; % Dot Product
% [m, ind] = max(abs(inner_product));
ind = GetMaxIndex(r, dictionary);
index = [index, ind]; % Ugly: growing vector!
atoms(:, ind) = dictionary(:, ind); % Select maximum atom
dictionary(:, ind) = 0;
x = (atoms(:, index).' * atoms(:, index)) \ (atoms(:, index).' * r);
r = r - atoms(:, index) * x;
% v = atoms(:, index); % Not exactly the same!!!
% x = (v.' * v) \ (v.' * r);
% r = r - v * x;
t = norm(r);
end
coefs(index) = x; % Outside the loop
sparse(:,ii) = coefs;
sig = D * coefs;
final(:, ii) = (((max(s)) - min(s)) / (max(sig) - min(sig))) * ...
(sig - min(sig)) + min(s);
end
warning('ON', 'MATLAB:nearlySingularMatrix');
toc
And GetMaxIndex is a C-mex function:
// Author: Jan Simon 2017, License: CC BY-SA 3.0
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
// GetMaxIndex(r, dict)
// [~, ind] = max(abs(dict' * r));
double *r, *dict; // Inputs
size_t nr, ndict;
double *rEnd, *s, dot, maxValue; // Working
size_t i, maxIndex;
r = mxGetPr(prhs[0]); // Obtain inputs
nr = mxGetNumberOfElements(prhs[0]);
rEnd = r + nr;
dict = mxGetPr(prhs[1]);
ndict = mxGetN(prhs[1]);
maxValue = 0; // Calculation
maxIndex = 0;
for (i = 0; i < ndict; i++) {
dot = 0.0; // Dot product
for (s = r; s < rEnd; dot += *s++ * *dict++) ; // empty loop
dot = fabs(dot);
if (dot > maxValue) { // Remember maximum index
maxValue = dot;
maxIndex = i;
}
}
plhs[0] = mxCreateDoubleScalar((double) (maxIndex + 1));
return;
}
Please note the 2 comments "Not exactly the same!!!". I've commented the sections out, although they are faster and differ in the magnitude of eps() only in the resutls. But this changes the output final already, which means, that your algorithm is instable: The result depends on tiny rounding effects. Does this satisfy your needs?
For the first 15 iterations for ii this needs 10.8 sec instead of 31.18 sec - R2016b/64/Win7 without Parallel Computing Toolbox. The full problem will need 150 times longer divided by the number of cores.
  1 commentaire
Navdeep Sony
Navdeep Sony le 13 Mar 2017
Thanks Jan. Your suggestions are really helpful though I have not tried mex files but surely will now.

Connectez-vous pour commenter.

Plus de réponses (1)

Jan
Jan le 11 Mar 2017
Modifié(e) : Jan le 12 Mar 2017
  • Start with replacing the inefficient sum(dictionary(:)~=0) to:
while(t>1e-15 && any(dictionary(:)))
This works, if dictionary does not contain negative values. Is this assumption correct? This saves 60% processing time in my R2016b/64/Core2Duo without Parallel Computing Toolbox.
  • Slightly faster for the normalization:
for index = 1:size(dictionary, 2)
v = dictionary(:, index);
dictionary(:, index) = v ./ sqrt(v' * v);
end
To my surprise this changes the result, although the changes in the variable dictionary are in the maginitude of eps()! This means, that your algorithm is instable. Keep care.
Most of all the normalization can be moved before the loop instead of calculating it repeatedly.
  • Create "final" with the correct type, if you store uint8 values in it:
final = zeros(144, 2209, 'uint8');
  • But now inspect the loop:
while t > 1e-15 && any(dictionary(:)) % Process while norm > 1e-15
inner_product = r' * dictionary; % Dot Product
[m, ind] = max(abs(inner_product));
index = [index, ind];
atoms(:, ind) = dictionary(:, ind); % Select maximum atom
dictionary(:, ind) = 0;
v = atoms(:, index);
x = (v' * v) \ (v' * r); % Abbreviated, further 10% faster
coefs(index) = x;
r = r - v * x;
t = norm(r);
end
The bottleneck is the dot product. About 50% of the columns are set to 0 in average, such that the code might be twice as fast, when these columns are ignored in the search of the maximum. This can be done in a C-Mex function easily. But do you have a C-compiler installed?
  • Move "coefs(index) = x;" behind the loop, because only the last value matters.
[EDITED] It is less efficient than I thought to ignore the columns set to 0 before. I post my final version in a second answer to increase the readability.

Catégories

En savoir plus sur Logical dans Help Center et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by