Is there a faster way of splitting a cell array into numeric array while preserving NaN?

2 vues (au cours des 30 derniers jours)
Greetings,
I am trying to split a set of data into rows and columns of numeric data that will preserve the position of empty data (as NaN or anything similar).
The input data is a cell array with rows of strings. The columns are delimited by a semi-colon ' ; '. The first 8 columns are filled with garbage data and there are many trailing columns with no data at all. I even sometimes have rows with no data. The attached data sample is just 4,000 rows long but I actually have datasets that have between 50,000 and 300,000 rows.
I have been using the code below but the str2double step is incredibly slow. Can anyone offer an alternative approach that can cut down on the processing time?
% split data by the ' ; ' separator
data = cellfun(@(x) split(x,';'),data,'UniformOutput',false);
% get rid of preceding garbage data in columns 1 to 8
data = cellfun(@(x) x(9:end),data,'UniformOutput',false);
% convert data into double. This step is incredibly slow
data = cellfun(@str2double,data,'UniformOutput',false);
% example of next operations I wish to perform on this data
data_a = cellfun(@(x) x(1:2:end),data,'UniformOutput',false);
data_b = cellfun(@(x) x(2:2:end),data,'UniformOutput',false);
Thank you in advance for any help
  3 commentaires
Alex Wolf
Alex Wolf le 22 Août 2019
This offers a fairly good improvement to my code. Thank you for your suggestion.
Adam Danz
Adam Danz le 22 Août 2019
Thank you for the feedback. I've never tried the function myself.

Connectez-vous pour commenter.

Réponse acceptée

TADA
TADA le 22 Août 2019
Modifié(e) : TADA le 22 Août 2019
try this
endsWithSemicolon = cellfun(@(s) endsWith(s, ';'), data);
x = cellfun(@(s) textscan(s, '%f', 'Delimiter', ';', 'EmptyValue', nan(), 'Whitespace', ' *\n\t\r\b'), data);
x = cellfun(@(a) a(9:end), x, 'UniformOutput', false);
x(endsWithSemicolon) = cellfun(@(a) [a; nan], x(endsWithSemicolon), 'UniformOutput', false);
  4 commentaires
Adam Danz
Adam Danz le 23 Août 2019
Modifié(e) : Adam Danz le 23 Août 2019
teamwork +1
:D

Connectez-vous pour commenter.

Plus de réponses (0)

Catégories

En savoir plus sur Data Type Identification dans Help Center et File Exchange

Produits


Version

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by