Rearrange cell content by groups
There's got to be a simple way to do this...
I have a cell array of data by date in which there are N (or fewer) instances of M quantities for each date. I need to rearrange it so that the M quantities of every instance for a given date appear on a single row. For example, given
Date Fund Shares Value
___________ _____________________________________________ _______________ ______
30-Nov-2015 'American International Growth & Income Fund' '67.958 shares' 1979.6
30-Nov-2015 'American Mutual Fund' '54.930 shares' 1997.3
30-Nov-2015 'Capital Income Builder' '34.843 shares' 1997.2
30-Nov-2015 'Europacific Growth Fund' '41.528 shares' 1988.8
30-Nov-2015 'Income Fund of America Fund' '95.102 shares' 2001.9
31-Dec-2015 'American International Growth & Income Fund' '67.958 shares' 1936.6
31-Dec-2015 'American Mutual Fund' '54.930 shares' 1952.1
31-Dec-2015 'Capital Income Builder' '34.843 shares' 1964.2
31-Dec-2015 'Europacific Growth Fund' '41.528 shares' 1941.1
31-Dec-2015 'Income Fund of America Fund' '95.102 shares' 1975.3
31-Jan-2016 'American International Growth & Income Fund' '67.958 shares' 1847.1
31-Jan-2016 'American Mutual Fund' '54.930 shares' 1894.5
31-Jan-2016 'Capital Income Builder' '34.843 shares' 1939.6
31-Jan-2016 'Europacific Growth Fund' '41.528 shares' 1823
31-Jan-2016 'Income Fund of America Fund' '95.102 shares' 1930.4
...
30-Sep-2017 'American International Growth & Income Fund' '67.958 shares' 44.02
30-Sep-2017 'Europacific Growth Fund' '41.528 shares' 101.21
31-Oct-2017 'American International Growth & Income Fund' '67.958 shares' 44.3
31-Oct-2017 'Europacific Growth Fund' '41.528 shares' 103.35
30-Nov-2017 'American International Growth & Income Fund' '67.958 shares' 44.3
30-Nov-2017 'Europacific Growth Fund' '41.528 shares' 103.35
one observes there are five Funds for each of the first few dates but only two at the end. The desired output would be
Date Fund1 Shares1 Value1 Fund2 Shares2 Value2 ...
___________ _________________ _______________ ______ _________________ _______________ ____________ ...
30-Nov-2015 'American Int...' '67.958 shares' 1979.6 'American Mut...' '54.930 shares' 1997.3 ...
31-Dec-2015 'American Int...' '67.958 shares' 1936.6 'American Mut...' '54.930 shares' 1952.1 ...
...
30-Nov-2017 'American Int...' '67.958 shares' 44.3 '' '' '' ...
where I truncated the lines showing only the first two of five. Of course, there would be empty cells at the end where there are only the two of five still extant.
It's easy enough to get grouping indices; I just haven't figured out the simple way to process them to build the output array from the grouping numbers.
1 Comment
dpb
on 23 Dec 2018
Accepted Answer
More Answers (1)
Hi dpb,
Assuming that you have a cell array (and that you are using this table type of output just for display purposes), here is a "neither-too-elegant-nor-too-satisfactory" approach:
data = {'10/8/17','A',10,100; '10/8/17','B',8,30; ...
'11/8/17','E',17,70; '11/8/17','A',14,80; '11/8/17','C',5,110; ...
'12/8/17','B',9,50} ;
[groupId, dates] = findgroups( data(:,1) ) ;
% - Build row/col IDs in output array.
rowId = reshape( (groupId * ones(1,3)).', [], 1 ) ;
colId = arrayfun( @(n)1+(1:3*n), accumarray(groupId, 1), 'UniformOutput', false ) ;
colId = [colId{:}].' ;
% - Build output array.
outData = cell( max(rowId), max(colId) ) ;
outData(:,1) = dates ;
outData(sub2ind( size(outData), rowId, colId )) = reshape( data(:,2:end).', [], 1 ) ;
where the input data is:
>> data
data =
6×4 cell array
{'10/8/17'} {'A'} {[10]} {[100]}
{'10/8/17'} {'B'} {[ 8]} {[ 30]}
{'11/8/17'} {'E'} {[17]} {[ 70]}
{'11/8/17'} {'A'} {[14]} {[ 80]}
{'11/8/17'} {'C'} {[ 5]} {[110]}
{'12/8/17'} {'B'} {[ 9]} {[ 50]}
and the output is:
>> outData
outData =
3×10 cell array
{'10/8/17'} {'A'} {[10]} {[100]} {'B' } {[ 8]} {[ 30]} {0×0 double} {0×0 double} {0×0 double}
{'11/8/17'} {'E'} {[17]} {[ 70]} {'A' } {[ 14]} {[ 80]} {'C' } {[ 5]} {[ 110]}
{'12/8/17'} {'B'} {[ 9]} {[ 50]} {0×0 double} {0×0 double} {0×0 double} {0×0 double} {0×0 double} {0×0 double}
Hoping that you'll get a more elegant approach ..
Happy new year!
Cedric
EDIT 1 : Using SPLITAPPLY can help to some extent:
>> outData = splitapply( @(x){reshape(x.',[],1).'}, data(:,2:end), groupId )
outData =
3×1 cell array
{1×6 cell}
{1×9 cell}
{1×3 cell}
but then we must implement padding for the concatenation.
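One possible way to do that padding is sketched below. It reuses the small sample data and the SPLITAPPLY call from this answer; the variable names rows, nCols and padded are made up for illustration, and this is only one of several ways to do it, not necessarily the shortest.

```matlab
% Sample data, same layout as in the answer above.
data = {'10/8/17','A',10,100; '10/8/17','B',8,30; ...
        '11/8/17','E',17,70; '11/8/17','A',14,80; '11/8/17','C',5,110; ...
        '12/8/17','B',9,50} ;
[groupId, dates] = findgroups( data(:,1) ) ;

% One 1-by-k cell row per date, as in EDIT 1.
rows = splitapply( @(x){reshape(x.',[],1).'}, data(:,2:end), groupId ) ;

% Pad every row with empty cells up to the length of the longest one.
nCols  = max( cellfun( @numel, rows )) ;
padded = cellfun( @(r)[r, cell(1, nCols-numel(r))], rows, 'UniformOutput', false ) ;

% Stack the now equal-length rows and prepend the date column.
outData = [dates, vertcat( padded{:} )] ;
```

With this sample data, outData should come out as a 3-by-10 cell array with empty cells at the end of the shorter rows, like the output of the first approach.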
7 Comments
dpb
on 31 Dec 2017
Cedric
on 31 Dec 2017
I agree that there should be a builtin for advanced reshaping by group. I updated my code (attached) a little, using UNIQUE with the 'stable' arg because FINDGROUPS always sorts the groups. It seems to output what you are after.
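The difference in group ordering can be checked with a small sketch. The keys below are made-up values, deliberately not in sorted order, and the variable names are only for illustration.

```matlab
% Hypothetical keys, deliberately not in sorted order.
keys = {'b'; 'a'; 'b'; 'c'; 'a'} ;

% FINDGROUPS numbers the groups in sorted order of the keys.
[gSorted, namesSorted] = findgroups( keys ) ;

% UNIQUE with 'stable' numbers them in order of first appearance.
[namesStable, ~, gStable] = unique( keys, 'stable' ) ;
```

Here namesSorted is {'a';'b';'c'} while namesStable is {'b';'a';'c'}, so the output rows built from gStable follow the order in which the dates first appear in the data.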
My pleasure dpb! Note that one of the most elegant/concise ways to start is probably:
[~, dateId, groupId] = unique( data(:,1), 'stable' ) ;
outData = splitapply( @(x){reshape(x.',[],1).'}, data(:,2:end), groupId )
If you run this on data as defined in the M-File above, you will see that you get all the rows in a cell array (minus the first column that is trivial). Yet, we must then obviously pad this output before we can concatenate it (or at least perform some processing). Surprisingly, this takes several lines of code to perform, so overall I would not say that it is really cleaner!
Cheers!
Cedric
dpb
on 3 Jan 2018
I have the same feeling, there should be a better approach!
About SPLITAPPLY, I think that it is not intuitive because, in the beginning, we have a hard time guessing how the input array is split and to what the function passed as 1st arg is applied. A good way to check that out is to DISP it with no output arg:
>> splitapply( @disp, data(:,2:end), groupId )
'American International Growt…' '67.958 shares' [1.9796e+03]
'American Mutual Fund' '54.930 shares' [1.9973e+03]
'Capital Income Builder' '34.843 shares' [1.9972e+03]
'Europacific Growth Fund' '41.528 shares' [1.9888e+03]
'Income Fund of America Fund' '95.102 shares' [2.0019e+03]
'American International Growt…' '67.958 shares' [1.9366e+03]
'American Mutual Fund' '54.930 shares' [1.9521e+03]
'Capital Income Builder' '34.843 shares' [1.9642e+03]
'Europacific Growth Fund' '41.528 shares' [1.9411e+03]
'Income Fund of America Fund' '95.102 shares' [1.9753e+03]
'American International Growt…' '67.958 shares' [1.8471e+03]
'American Mutual Fund' '54.930 shares' [1.8945e+03]
'Capital Income Builder' '34.843 shares' [1.9396e+03]
'Europacific Growth Fund' '41.528 shares' [ 1823]
'Income Fund of America Fund' '95.102 shares' [1.9304e+03]
'American International Growt…' '67.958 shares' [ 44.0200]
'Europacific Growth Fund' '41.528 shares' [101.2100]
'American International Growt…' '67.958 shares' [ 44.3000]
'Europacific Growth Fund' '41.528 shares' [103.3500]
'American International Growt…' '67.958 shares' [ 44.3000]
'Europacific Growth Fund' '41.528 shares' [103.3500]
Seeing this, it becomes obvious why I used:
@(x){reshape(x.',[],1).'}
in my previous comment, and why I pass all columns but the first.
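To make the inner part of that anonymous function concrete, here is a minimal sketch on a made-up 2-by-3 block of the kind SPLITAPPLY passes for one date. The names x, flat and colMajor are hypothetical.

```matlab
% Hypothetical block for one date: one row per fund, three quantities each.
x = {'A', 10, 100; 'B', 8, 30} ;

% Transpose first so that the column-major reshape walks through the
% block row by row, then turn the result into a single row.
flat = reshape( x.', [], 1 ).' ;      % {'A', 10, 100, 'B', 8, 30}

% Without the transpose the quantities get interleaved across funds.
colMajor = reshape( x, 1, [] ) ;      % {'A', 'B', 10, 8, 100, 30}
```

The outer braces in the anonymous function then wrap this row in a single cell, so that each group returns one element and the rows can differ in length from group to group.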
dpb
on 4 Jan 2018
