Finding NaN and Missing values from a mat cell matrix
Jorge Luis Paredes Estacio
on 15 August 2024
Commented: Voss
on 15 August 2024
Hello, I have obtained a global matrix from an analysis (attached here; it is a reduced version because the full file exceeds the 5 MB limit), and I would like to find the NaN and missing values in it, so that I can sort out the issues in the code that produce those values before generating a more complex simulation analysis. As you will see in the MAT-file, there are 70 columns with separate information, and each row is identified by its 1st column, which holds the unique event in my database. I would like to generate two tables with the following information:
1st table: a summary of the NaN values in the whole matrix (the attached file may not contain any NaN values, as I had to reduce the number of rows to stay under 5 MB). It should give the location of each NaN value as the date of the event (first column: date_event), the name of the station (column 46), and the name of the variable (column) that has the NaN value. For example:
matrix_NaN=['1985-03-03 22:47:08', 'CFLAN', 'Rrup1'; '1997-02-19 18:25:14', 'CPLAT', 'Rx'; ..........]
2nd table: the same information for the missing values:
matrix_missing=['2003-08-26 21:11:35', 'CFLAN', 'Rx'; '2003-08-26 21:11:35', 'CTRUJ', 'sigma_Rx'; ..........]
I would appreciate any help.
1 comment
Stephen23
on 15 August 2024
Why is this data inefficiently stored as lots and lots of scalar arrays inside a cell array?
Using one table would be much more efficient, and offer much easier ways to process the data.
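As a rough illustration of the table suggestion above, here is a minimal sketch on made-up toy data (the values and the column layout are assumptions, not the contents of the attached file). It assumes row 1 of the cell array holds valid, unique variable names and that each column holds a single data type; columns that mix numbers and <missing> would likely need converting first.

```matlab
% Toy data standing in for the real cell array (all values are made up).
C = {'date_event',          'station', 'Rrup1', 'Rx';
     '1985-03-03 22:47:08', 'CFLAN',   NaN,     12.5;
     '1997-02-19 18:25:14', 'CPLAT',   30.1,    NaN};

% Row 1 holds the variable names; the remaining rows hold the data.
T = cell2table(C(2:end,:), 'VariableNames', C(1,:));

% ismissing on a table returns one logical element per table entry.
mask = ismissing(T);
[r, c] = find(mask);

% One row per flagged entry: event date, station, variable name.
summary = table(T.date_event(r), T.station(r), T.Properties.VariableNames(c).', ...
    'VariableNames', {'date_event', 'station', 'variable'});
disp(summary)
```

With the data in a table, the row and column labels travel with the values, so no separate header row has to be indexed by hand.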
Accepted Answer
Voss
on 15 August 2024
load('example_global.mat')
C = example_global
As you said, the cell array in the attached mat file doesn't have any NaNs, so I'm going to introduce some for testing/demonstration purposes, to show that the distinction between NaN and <missing> can be made:
C{5,5} = NaN; % introducing NaNs for testing/demonstration
C{100,60} = NaN;
Now, construct a cell array containing info about the NaN and missing values, with 3 columns (corresponding datetime value, "variable" name, and value - either NaN or <missing>):
[ridx,cidx] = find(cellfun(@(x)any(ismissing(x)),C));
lidx = sub2ind(size(C),ridx,cidx);
C_missing = [C(ridx,1) C(1,cidx).' C(lidx)];
disp(C_missing)
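To see why a single ismissing test catches both kinds of value, and why isnumeric can tell them apart afterwards, here is a small sketch on made-up values (not taken from the attached file). One caveat, as far as I know: for char vectors ismissing treats a blank character as missing, so char cells containing spaces could be flagged too; string or datetime cells do not have that problem.

```matlab
% Four made-up cell contents: a NaN, a <missing>, a normal number, a string.
vals = {NaN, missing, 3.2, "CFLAN"};

% ismissing is true for NaN and for <missing>; any() collapses non-scalar results.
flagged = cellfun(@(x) any(ismissing(x)), vals);

% NaN is a double, so it is numeric; <missing> is an object of class "missing".
is_num = cellfun(@isnumeric, vals);

disp(flagged)
disp(is_num)
```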
If you need to split it into two cell arrays, one for the NaNs and one for the <missing>s, you can do so like this:
nan_rows = cellfun(@isnumeric,C_missing(:,3));
matrix_NaN = C_missing(nan_rows,[1 2]);
matrix_missing = C_missing(~nan_rows,[1 2]);
disp(matrix_NaN)
disp(matrix_missing)
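The question also asks for the station name, which the asker says is in column 46. A possible variation that adds it is sketched below on a made-up toy cell array; station_col is a hypothetical variable introduced for this example (2 in the toy data, 46 in the real data according to the question), and the dates are stored as strings here purely for simplicity.

```matlab
% Toy cell array (made-up values): row 1 = variable names, column 1 = event dates.
C = {'date_event',          'station', 'Rrup1', 'Rx';
     "1985-03-03 22:47:08", "CFLAN",   NaN,     12.5;
     "2003-08-26 21:11:35", "CTRUJ",   30.1,    missing};
station_col = 2;   % hypothetical index for this toy example; 46 in the real data

% Locate every cell holding NaN or <missing> and fetch the flagged values.
[ridx, cidx] = find(cellfun(@(x) any(ismissing(x)), C));
vals = C(sub2ind(size(C), ridx, cidx));

% Event date, station name and variable name for every flagged cell.
info = [C(ridx,1), C(ridx,station_col), C(1,cidx).'];

% Split by type: NaN is numeric, <missing> is not.
is_nan = cellfun(@isnumeric, vals);
matrix_NaN     = info(is_nan, :);
matrix_missing = info(~is_nan, :);
disp(matrix_NaN)
disp(matrix_missing)
```

Note that any non-numeric missing value (for example a NaT in a datetime cell) would land in matrix_missing with this split.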