Big data percentile calculation

David Santos on 8 Aug 2019
Commented: David Santos on 8 Aug 2019
Hi,
I have a large set (30,000) of MAT-files, each containing a 4x1 cell array of 1483x2824 double matrices (4 matrices per file, ~30-40 MB).
These are time series files representing simulations over 3 months.
I want to calculate the percentile across all of these time series files, but that needs too much memory on my computer because I have to load all the files. Any clue on how to solve this? I'm working on a server with 20 cores/40 threads and 256 GB of memory.
I heard about the P-square algorithm, but I couldn't find anything similar in MATLAB.
All the best

Answers (1)

Steven Lord on 8 Aug 2019
See some of the tools and techniques available in MATLAB for working with Big Data, data that's too big to fit in memory. Many functions are supported on tall arrays.
2 Comments
David Santos on 8 Aug 2019
Edited: David Santos on 8 Aug 2019
Thanks!
What would you recommend if I want to convert my 4x1 cell array files into just one?
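If "just one" here means a single numeric array per file, a minimal sketch is to concatenate the four cells along a third dimension with cat. The variable name c and the small sizes below are stand-ins for illustration, not taken from the actual files.

```matlab
% Stand-in for the 4x1 cell array stored in one file.
% Small sizes for illustration; the real matrices are 1483x2824.
c = {rand(3, 5); rand(3, 5); rand(3, 5); rand(3, 5)};

% Stack the four matrices as pages of one 3-D numeric array.
A = cat(3, c{:});

sz = size(A);   % [3 5 4]
```

With the real data the same call would give a 1483x2824x4 double array, which is numeric and can be passed to prctile directly.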
David Santos on 8 Aug 2019
OK, I'm trying a fileDatastore with tall arrays.
After all the definitions I have the tall array t:
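function data = loadPrc(filename)
    % Load one MAT-file and return the first matrix of its 4x1 cell array.
    s = load(filename);
    % The stored variable is named 'l' plus the file name minus its last 7 characters.
    [~, name] = fileparts(filename);
    c = s.(['l' name(1:end-7)]);
    data = c{1};
end

ds = fileDatastore('matBorrame', 'ReadFcn', @loadPrc, 'FileExtensions', '.mat')
t = tall(ds)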
t =
4×1 tall cell array
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
My problem is that the prctile calculation now fails with a data type error:
gather(prctile(t,90,3))
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: 0% complete
Evaluation 0% complete
Error using tall/prctile (line 48)
Argument 1 to PRCTILE must be one of the following data types: numeric.
Learn more about errors encountered during GATHER.
That's because t should have the shape 1483x2824x4, but I can't reshape or permute a tall array. Any clue on how to solve this?
All the best
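One way around the tall cell array, sketched here under stated assumptions rather than as a tested solution for these files, is to skip tall arrays and compute the percentile per grid point one block of rows at a time. Only blockRows x nCols x nFiles values are held in memory at once. The names blockPrctile, loadOne and blockRows are hypothetical; the demo uses a small in-memory array in place of real MAT-files, and prctile requires the Statistics and Machine Learning Toolbox, as in the code above.

```matlab
% Demo with in-memory stand-ins: each "file" is one page of data.
data    = rand(6, 4, 10);          % 10 slices of a 6x4 grid
files   = num2cell(1:10);          % hypothetical file list (indices here)
loadOne = @(f) data(:, :, f);      % hypothetical loader; for real files use e.g. @loadPrc

P = blockPrctile(files, loadOne, 90, 6, 4, 2);   % 90th percentile, 2 rows per block

function P = blockPrctile(files, loadOne, p, nRows, nCols, blockRows)
    % Percentile p at every grid point, taken across all files.
    nFiles = numel(files);
    P = zeros(nRows, nCols);
    for r0 = 1:blockRows:nRows
        rows  = r0:min(r0 + blockRows - 1, nRows);
        block = zeros(numel(rows), nCols, nFiles);
        for k = 1:nFiles
            A = loadOne(files{k});         % one full matrix from file k
            block(:, :, k) = A(rows, :);   % keep only the current rows
        end
        P(rows, :) = prctile(block, p, 3); % percentile along the file dimension
    end
end
```

As a rough sizing, assuming one slice per file: a single grid row across 30,000 slices takes 2824 x 30,000 x 8 bytes, about 0.68 GB, so 100 rows per block would need about 68 GB. The price is I/O, since every block re-reads every file (15 passes for 1483 rows at 100 rows per block).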
