How to slice each string in a string array without using for loop

YANAN ZHU on 18 Sep 2018
Edited: Cedric on 19 Sep 2018
Given a cell array of character vectors, for example,
celldata =
3×1 cell array
{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}
Is there an array operation (i.e. one that avoids a for loop) that extracts the month from each cell and converts the result to a 3×1 numeric column vector, which here should be [12; 11; 9]? I would like to avoid a for loop because it was too slow.

Accepted Answer

Cedric on 19 Sep 2018
Edited: Cedric on 19 Sep 2018
If you favor performance over readability/maintainability, you can build an approach around the following:
buffer = vertcat( dates{:} ) ;
months = (buffer(:,6:7) - '00') * [10;1] ;
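where dates is a cell array of date strings (celldata in your example). This relies on every string having the same length and layout, because vertcat stacks them into a char matrix.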
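To see why the two lines work, it may help to run the steps separately on the three example dates. This is only a sketch that assumes the fixed yyyy-mm-dd layout; the variable name digits is introduced here for illustration and is not part of the answer.

```matlab
dates = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% Stack the equal-length char vectors into a 3x10 char matrix.
buffer = vertcat(dates{:});

% Subtracting the character '0' turns each digit character into its numeric value.
digits = buffer(:, 6:7) - '0';    % [1 2; 1 1; 0 9]

% Weight the tens and units columns and sum them with a matrix product.
months = digits * [10; 1];        % [12; 11; 9]
```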
Here is the benchmark:
% Build "large" data set.
N = 1e4 ;
dates = repmat( {'2018-12-12'; '2018-11-05'; '2018-09-02'}, N, 1 ) ;
% Basic FOR loop, STR2DOUBLE.
tic ;
n = numel( dates ) ;
months_forStr2double = zeros( n, 1 ) ;
for k = 1 : n
    months_forStr2double(k) = str2double( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2DOUBLE : %.3fs\n', toc ) ;
% Basic FOR loop, STR2NUM.
tic ;
n = numel( dates ) ;
months_forStr2num = zeros( n, 1 ) ;
for k = 1 : n
    months_forStr2num(k) = str2num( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2NUM : %.3fs\n', toc ) ;
% Basic FOR loop, SSCANF.
tic ;
n = numel( dates ) ;
months_forScanf = zeros( n, 1 ) ;
for k = 1 : n
    months_forScanf(k) = sscanf( dates{k}(6:7), '%d' ) ;
end
fprintf( 'Basic FOR, SSCANF : %.3fs\n', toc ) ;
% CELLFUN (hidden FOR), SSCANF.
tic ;
months_cellfun = cellfun( @(date) sscanf( date(6:7), '%d' ), dates ) ;
fprintf( 'CELLFUN, SSCANF : %.3fs\n', toc ) ;
% REGEXP
tic ;
months_regexp = str2double( regexp( dates,'(?<=-)\d+(?=-)','match','once' )) ;
fprintf( 'REGEXP : %.3fs\n', toc ) ;
% CELL2MAT, STR2NUM
tic ;
chardata = cell2mat( dates ) ;
months_cell2matStr2num = str2num( chardata(:,6:7) ) ;
fprintf( 'CELL2MAT, STR2NUM : %.3fs\n', toc ) ;
% DATETIME
tic ;
dt = datetime( dates) ;
months_datetime = month( dt ) ;
fprintf( 'DATETIME : %.3fs\n', toc ) ;
% SSCANF
tic ;
months_sscanf = sscanf([dates{:}],'%*4d-%2d-%*2d') ;
fprintf( 'SSCANF : %.3fs\n', toc ) ;
% EXTRACTBETWEEN
tic ;
months_extractBetween = extractBetween( dates, '-', '-' ) ;
months_extractBetween = cellfun( @str2double, months_extractBetween ) ;
fprintf( 'EXTRACTBETWEEN : %.3fs\n', toc ) ;
% Trick.
tic ;
buffer = vertcat( dates{:} ) ;
months_trick = (buffer(:,6:7) - '00') * [10;1] ;
fprintf( 'Trick: %.3fs\n', toc ) ;
% Check
disp( [isequal( months_forStr2num, months_forStr2double ), ...
       isequal( months_forScanf, months_forStr2double ), ...
       isequal( months_cellfun, months_forStr2double ), ...
       isequal( months_regexp, months_forStr2double ), ...
       isequal( months_cell2matStr2num, months_forStr2double ), ...
       isequal( months_datetime, months_forStr2double ), ...
       isequal( months_sscanf, months_forStr2double ), ...
       isequal( months_extractBetween, months_forStr2double ), ...
       isequal( months_trick, months_forStr2double )] ) ;
Output:
Basic FOR, STR2DOUBLE : 0.489s
Basic FOR, STR2NUM : 0.975s
Basic FOR, SSCANF : 0.356s
CELLFUN, SSCANF : 0.550s
REGEXP : 0.673s
CELL2MAT, STR2NUM : 0.015s
DATETIME : 0.201s
SSCANF : 0.023s
EXTRACTBETWEEN : 0.624s
Trick: 0.008s
1 1 1 1 1 1 1 1 1
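The timings above come from a single tic/toc run each, so the smallest ones can vary noticeably between runs. If more stable numbers are needed, one option is timeit, which calls a function handle repeatedly. The sketch below times only two of the approaches; the handle names toMonths, runTrick and runSscanf are hypothetical and chosen for this example.

```matlab
% Same test data as in the benchmark.
N = 1e4;
dates = repmat({'2018-12-12'; '2018-11-05'; '2018-09-02'}, N, 1);

% Wrap each candidate in a zero-argument function handle, as timeit expects.
toMonths  = @(buf) (buf(:, 6:7) - '0') * [10; 1];   % hypothetical helper
runTrick  = @() toMonths(vertcat(dates{:}));
runSscanf = @() sscanf([dates{:}], '%*4d-%2d-%*2d');

% timeit runs each handle several times and returns a typical time in seconds.
tTrick  = timeit(runTrick);
tSscanf = timeit(runSscanf);

fprintf('Trick  : %.4fs\n', tTrick);
fprintf('SSCANF : %.4fs\n', tSscanf);
```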
2 Comments
Stephen23 on 19 Sep 2018
Edited: Stephen23 on 19 Sep 2018
sscanf does not require a loop: simply concatenate the char vectors and use an appropriate format string.
C = {'2018-12-12','2018-11-05','2018-09-02'}
sscanf([C{:}],'%*4d-%2d-%*2d')
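As a sketch of how the format string behaves: an asterisk in a conversion such as %*4d tells sscanf to read that field and discard it, and the format is reapplied until the text is exhausted. Removing the asterisks keeps all three fields. The names txt and ymd are introduced here for illustration, and the fixed yyyy-mm-dd layout is assumed.

```matlab
C = {'2018-12-12', '2018-11-05', '2018-09-02'};
txt = [C{:}];                       % '2018-12-122018-11-052018-09-02'

% %*4d reads and discards the year, %2d keeps the month, %*2d discards the day.
months = sscanf(txt, '%*4d-%2d-%*2d');          % [12; 11; 9]

% Without the asterisks all three fields are kept; [3 Inf] gives one column per date.
ymd = sscanf(txt, '%4d-%2d-%2d', [3 Inf]).';    % rows are [year month day]
```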
Cedric on 19 Sep 2018
Thanks Stephen, just added this to the benchmark!


More Answers (4)

Christopher Wallace on 19 Sep 2018
chardata = cell2mat(celldata);
numdata = str2num(chardata(:,6:7));

Paolo on 18 Sep 2018
Edited: Paolo on 18 Sep 2018
str2double(regexp(celldata,'(?<=-)\d+(?=-)','match','once'))
ans =
12
11
9
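The pattern above uses a lookbehind and a lookahead so that only the digits between two hyphens are matched. A possible variant, shown here only as a sketch, anchors the whole yyyy-mm-dd layout and captures the month as a token; the variable name tok is an assumption made for this example.

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% Anchored pattern: four digits, hyphen, two captured digits, hyphen, two digits.
tok = regexp(celldata, '^\d{4}-(\d{2})-\d{2}$', 'tokens', 'once');

% Each cell of tok holds a 1x1 cell with the captured month text.
months = str2double(vertcat(tok{:}));   % [12; 11; 9]
```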

Star Strider on 19 Sep 2018
Using datetime and its functions:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
dt = datetime(celldata);
M = month(dt)
M =
12
11
9
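When the layout of the text is known, it can be passed to datetime through the 'InputFormat' option, which the MATLAB documentation recommends for best parsing performance. The sketch below assumes the yyyy-MM-dd layout of the example; how much it changes the timing was not measured in this thread.

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% Stating the layout explicitly spares datetime from guessing the format.
dt = datetime(celldata, 'InputFormat', 'yyyy-MM-dd');
M = month(dt);   % [12; 11; 9]
```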

Akira Agata on 19 Sep 2018
Another possible solution:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
M = extractBetween(celldata,'-','-');
M = cellfun(@str2double,M);
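Two small variations may be worth trying, sketched here under the assumption that the month always sits at characters 6 to 7: extractBetween also accepts numeric start and end positions, and str2double accepts a cell array directly, so the cellfun call is not strictly required.

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% Characters 6 through 7 hold the month in the yyyy-mm-dd layout.
M = str2double(extractBetween(celldata, 6, 7));   % [12; 11; 9]
```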
