Effacer les filtres
Effacer les filtres

Vectorizing a simple accumulation?

1 vue (au cours des 30 derniers jours)
Ephedyn
Ephedyn le 20 Sep 2012
I have something like a sparse array, whose first member is always nonzero, and I want to replace each zero element with the nearest non-zero element right before it. For example:
matrixData = [1.3; 0; 0; 0; 4.2; 0; 0; 1.5; 0; 0; 0; 0];
should become
matrixData = [1.3; 1.3; 1.3; 1.3; 4.2; 4.2; 4.2; 1.5; 1.5; 1.5; 1.5; 1.5];
I am currently using a loop:
emptyRows = (matrixData ==0);
for i = 2:length(matrixData)
if emptyRows(i)
matrixData(i) = matrixData(i-1);
end
end
This is the performance bottleneck on my function, and it becomes very slow as I deal with extremely long arrays, and I can't think of a way to speed it up. (Can't parallelize it because the elements are non-independent.) Is there a way to vectorize this using accumarray or anything similar?
Thanks!

Réponse acceptée

Sean de Wolski
Sean de Wolski le 20 Sep 2012
Modifié(e) : Sean de Wolski le 20 Sep 2012
matrixData = [1.3; 0; 0; 0; 4.2; 0; 0; 1.5; 0; 0; 0; 0];
idxk = find(matrixData);
idxr = cumsum(logical(matrixData));
matrixData = matrixData(idxk(idxr));
One of many ways...
  1 commentaire
Ephedyn
Ephedyn le 20 Sep 2012
Thanks a lot! This solved my problem and is amazingly powerful. I wish I could accept both yours and Jan's answers for credit, as I had use for both.

Connectez-vous pour commenter.

Plus de réponses (1)

Jan
Jan le 20 Sep 2012
Modifié(e) : Jan le 20 Sep 2012
If your vectors are really large, try a Mex function:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
mwSize n, i;
double *X, q, *Y;
n = mxGetNumberOfElements(prhs[0]);
plhs[0] = mxCreateDoubleMatrix(n, 1, mxREAL);
X = mxGetPr(prhs[0]);
Y = mxGetPr(plhs[0]);
q = mxGetNaN();
for (i = 0; i < n; i++) {
if (X[i] != 0.0) {
q = X[i];
}
Y[i] = q;
}
return;
}
The M-version needs some large temporary arrays:
1. t1 = find(matrixData)
2. t2 = logical(matrixData))
3. t3 = cumsum(t2)
4. t4 = idxk(idxr)
Therefore the C-method should have a great advantage.
This function can be parallelized: Use two additional inputs as inital and final index. Skip the inital phase until the 2st non-zero is found instead of inserting NaNs. Proceed after the final index until the next non-zero element as long a the vector length is not exceeded. This should scale very well with the number of cores.
Depending on the processor, this could be faster than the IF method:
int m;
for (i = 0; i < n; i++) {
m = (X[i] == 0);
q = X[i] * m + q * (m - 1);
Y[i] = q;
}
[EDITED] No, avoiding the IF is some percent slower. Some percent faster:
for (i = 0; i < n; i++) {
if (X[i] == 0) {
Y[i] = Y[i - 1];
} else {
Y[i] = X[i];
}
}
  1 commentaire
Ephedyn
Ephedyn le 20 Sep 2012
As above, I ended up implementing your solution in the production code though I had to debug in the command window (the actual function is a bit more complicated) using Sean's response. I'll really like to give my deepest gratitude to both of you and wish I could give both credit for answering my question. Thanks aplenty!

Connectez-vous pour commenter.

Catégories

En savoir plus sur Matrix Indexing dans Help Center et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by