can someone suggest to make this code faster
3 views (last 30 days)
Show older comments
Can someone suggest me a trick to make this code faster.
"all_DOY" = 1000*15000*23 ; %(i,j,k) such that k dimension is the day of the year whic is random, while (i,j) are my pixel location starting from (1,1) to (1000,15000). The values of day of the year (DOY) in the kth dimension rranges range from -1 to 366
"all_variable" = 1000*15000*23; This is my variable value for (i,j,k) for the (i,j) position and k dimension i.e., day of the year in the "all_DOY" matrix. The values in the kth dimension range from -3000 to 15000
all_variable_new = 1000*15000*366
I'm trying to update the "all_variable_new" such that, my k dimension is from 1:366 (days of the year). for each day of the year obtained from "all_DOY" matrix, the variable value is updated in the "all_variable_new" for that respective (i,J,k) from "all_variable"
all_variable = load('all_variable.mat');
all_DOY = load('all_DOY.mat');
for ii =1: size(all_DOY,1) % for each ii
for jj = 1:size(all_DOY,2) % for each jj
idx = squeeze(all_DOY(ii,jj,:)); % sequeez k'th dimension. idk is the DOY which is also the position value in all_variable_new
variable = squeeze(all_variable(ii,jj,:)); % obtain the variable values
variable(idx==-1)=; % if -1, remove
idx(idx==-1)=; % if -1 then remove the values
all_variable_new(ii,jj,:) = NaN; % if there are no idx values then replace with NAN
for mm = Ind
all_variable_new(ii,jj,mm) = variable; % for the day of the year values obtained, replace those positions with variable values.
clear idx Ind mm
Guillaume on 10 Mar 2020
Edited: Guillaume on 10 Mar 2020
If I understood correctly:
pixels_dayofyear = load('all_doy.mat');
pixels_temperature = load('all_variable.mat');
assert(isequal(size(pixels_dayofyear), size(pixels_temperature)), 'Size of matrices doesn''t match!');
%sort each pixel (row, column) temperature (pages) according to the corresponding day of year.
%temperature for invalid days of year (-1) are all replaced by 0 at the end of the page.
pixels_dayofyear(pixels_dayofyear == -1) = NaN; %replace by a value that is sorted last. Could be Inf instead of NaN
[~, order] = sort(pixels_dayofyear, 3); %sort the day of year of each pixel. Get the new order
[rows, cols, ~] = ndgrid(1:size(pixels_dayofyear, 1), 1:size(pixels_dayofyear, 2), 1:size(pixels_dayofyear, 3)); %create row and column indices replicated along the correct dimensions for sub2ind
newpixels_temperature = pixels_temperature(sub2ind(size(pixels_temperature), rows, cols, order)); %reorder temperature according to the sorting of the days
newpixels_temperature(isnan(pixels_dayofyear)) = 0; %replace invalid values by 0
Guillaume on 13 Mar 2020
Edited: Guillaume on 13 Mar 2020
"While interpolating the data, it is required to have all the data even if it is same DOY, instead of mean." I'm not saying it doesn't exist, but I'm not aware of an interpolation method that can cope with two different values at the same point.
"dividing it into 10 rows, makes it hard to keep count of data blocks" I don't see why it would be hard. It's certainly an easy way to get rid of memory issue. At the same time, it also makes it possible to parallelise the processing (but beware of memory usage!) since the processing of the blocks is completely independent of each other. On a supercomputer, I assume that would be a better use of the resources. So, yes, Instead of splitting by year, I'd split by satellite coordinates and process blocks of size MxNx(10*23) -> MxNx366 where M and N are such that the data fits comfortably in memory.
Addendum: Assuming than M and N are reasonable so that the below fits in memory, then this is how I'd implement the transformation from the MxNx(10*23) to MxNx366 matrix. This directly interpolates the MxNx(10*23) data without ever creating the matrix of 0s:
%pixels_dayofyear, a M x N x day matrix indicating the day of measurement of the respective location in pixels_temperature
%pixels_temperature, a M x N x day matrix indicating the temperature on the respective day in pixels_dayofyear
[M, N, days] = size(pixels_dayofyear);
[inrows, incols, ~] = ndgrid(1:M, 1:N, 1:days); %create matrices of locations for each temperature
isvalid = pixels_dayofyear > -1
temperature_interpolant = scatteredinterpolant(M(isvalid), N(isvalid), pixels_dayofyear(isvalid), pixels_temperature(isvalid)); %create interpolant using locations and days as coordinates
%the line below creates 3 double matrices of size M x N x 366. They will use a lot of memory for large M and N!
[qrows, qcols, qday] = ndgrid(1:M, 1:N, 1:366); %matrices of all query locations and days
interpolated_temperature = reshape(temperature_interpolant(qrows, qcols, qday), M, N, 366); %interpolate at all query points and reshape into 3D matrix
Beware that qrows, qcols and qday will require M*N*366*8 bytes of memory. ScatteredInterpolant only accepts doubles so there's no way to save memory there by using a smaller type. However, these 3 arrays can be used unchanged for all the blocks (assuming they're all the same size).
More Answers (0)
Find more on Linear Algebra in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!Start Hunting!